





THE INTERNATIONAL SERIES OF MONOGRAPHS ON PHYSICS

GENERAL EDITORS

R. H. FOWLER, F.R.S., Fellow of Trinity College, Cambridge, Professor of
Applied Mathematics in the University of Cambridge.

P. KAPITZA, F.R.S., Director of the Institute for Physical Problems,
Moscow; lately Fellow of Trinity College, Cambridge, and Royal Society
Messel Research Professor.

Already Published 

THE THEORY OF ELECTRIC AND MAGNETIC SUSCEPTIBILITIES. By J. H. VAN VLECK.
1932. Royal 8vo, pp. 396.

WAVE MECHANICS. ELEMENTARY THEORY. By J. FRENKEL.
Second Edition, 1936. Royal 8vo, pp. 322.

WAVE MECHANICS. ADVANCED GENERAL THEORY. By J. FRENKEL.
1934. Royal 8vo, pp. 533.

THE THEORY OF ATOMIC COLLISIONS. By N. F. MOTT and H. S. W. MASSEY.
1933. Royal 8vo, pp. 300.

RELATIVITY, THERMODYNAMICS, AND COSMOLOGY. By R. C. TOLMAN.
1934. Royal 8vo, pp. 518.

ELECTROLYTES. By HANS FALKENHAGEN. Translated by R. P. BELL.
1934. Royal 8vo, pp. 364.

CHEMICAL KINETICS AND CHAIN REACTIONS. By N. SEMENOFF.
1936. Royal 8vo, pp. 492.

THE PRINCIPLES OF QUANTUM MECHANICS. By P. A. M. DIRAC.
Second Edition, 1935. Royal 8vo, pp. 312.

RELATIVITY, GRAVITATION, AND WORLD-STRUCTURE. By E. A. MILNE.
1936. Royal 8vo, pp. 378.

THE QUANTUM THEORY OF RADIATION. By W. HEITLER.
1936. Royal 8vo, pp. 264.

THEORETICAL ASTROPHYSICS: ATOMIC THEORY AND THE ANALYSIS OF
ATMOSPHERES AND ENVELOPES. By S. ROSSELAND. 1936. Royal 8vo, pp. 376.

THE THEORY OF THE PROPERTIES OF METALS AND ALLOYS.
By N. F. MOTT and H. JONES. 1936. Royal 8vo, pp. 340.

STRUCTURE OF ATOMIC NUCLEI AND NUCLEAR TRANSFORMATIONS. By G. GAMOW.
Being a second edition of 'Constitution of Atomic Nuclei and
Radioactivity'. 1937. Royal 8vo, pp. 282.

ECLIPSES OF THE SUN AND MOON. By SIR FRANK DYSON and
R. v. d. R. WOOLLEY. 1937. Royal 8vo, pp. 168.



THE 

INTERNATIONAL SERIES

OF 

MONOGRAPHS ON PHYSICS 

GENERAL EDITORS

R. H. FOWLER and P. KAPITZA 



OXFORD UNIVERSITY PRESS 

AMEN HOUSE, E.C. 4
LONDON EDINBURGH GLASGOW NEW YORK
TORONTO MELBOURNE CAPETOWN BOMBAY
CALCUTTA MADRAS 

HUMPHREY MILFORD 

PUBLISHER TO THE UNIVERSITY 



THE PRINCIPLES OF
STATISTICAL 
MECHANICS 

BY 

RICHARD C. TOLMAN

PROFESSOR OF PHYSICAL CHEMISTRY AND MATHEMATICAL
PHYSICS AT THE CALIFORNIA INSTITUTE OF
TECHNOLOGY


OXFORD 

AT THE CLARENDON PRESS 
1938 



PRINTED IN GREAT BRITAIN



TO 

MY FRIEND AND COLLEAGUE

J. ROBERT OPPENHEIMER 

WITH WHOM ALL PARTS OF 


THIS BOOK HAVE BEEN 
DISCUSSED 




PREFACE

In the preparation of this book it has been my hope and purpose
to provide an exposition of the underlying principles of statistical
mechanics which would give a reasonably clear and complete picture
of the additional hypotheses that must be introduced and of the further
methods that must be developed, both in the case of the classical and
of the quantum mechanics, when it is desired to give appropriate
treatment to the behaviour of mechanical systems in states that are less
precisely specified than would be theoretically possible. It is even
hoped that the present book may in some measure meet, from a modern
point of view, the same needs for a fundamental exposition as were
originally so successfully filled, for the classical mechanics, by the
treatment given by Gibbs in his Elementary Principles in Statistical
Mechanics. Throughout the book, although the work of earlier
investigators will not be neglected, the deeper point of view and the more
powerful methods of Gibbs will be taken as ultimately providing the
most satisfactory foundation for the development of a modern
statistical mechanics.

The present work is in no sense a revision or second edition of my 
earlier book, Statistical Mechanics with Applications to Physics and
Chemistry, a book which was written at a time when the older quantum 
theory was being replaced by modern quantum mechanics, and a book 
which is now both out of date and out of print. Applications to actual 
physical-chemical systems are included in the present book only in so 
far as is desirable for illustrating the nature of the fundamental prin- 
ciples of statistical mechanics and the methods to be employed in 
applying them. The needs for a complete account of the manifold 
applications of statistical mechanics to physics and chemistry should 
be met for a long time by the exhaustive treatment given by Fowler 
in the second edition of his Statistical Mechanics. 

In writing the book I have received helpful criticisms and suggestions
from many physicists and chemists. In particular I must mention my
gratitude in this connexion to my colleagues, Professors W. V. Houston,
D. M. Yost, L. C. Pauling, R. M. Badger, R. G. Dickinson, J. H.
Sturdivant, and E. T. Bell, of the California Institute of Technology,
as well as to Professors R. H. Fowler of Cambridge, W. Pauli of Zürich,
H. P. Robertson of Princeton, E. Bright Wilson of Harvard, E. H.
Kennard of Cornell, and J. H. Hildebrand of California, with all of
whom I have discussed one or another of many questions.



Above all, however, I must express my indebtedness to my friend
and colleague, Professor J. R. Oppenheimer, of the University of
California and of the California Institute, with whom I have completely
discussed both specific details and general points of view as the
manuscript was being prepared, and from whom I have received both minor
suggestions and major enlightenment. This indebtedness to Professor
Oppenheimer is so great that it can only be partially repaid by the
dedication of the book which he has been willing to accept.

PASADENA, CALIFORNIA                                        R. C. T.



CONTENTS

Part One. THE CLASSICAL STATISTICAL MECHANICS

I. INTRODUCTION
§ 1. The Nature of Statistical Mechanics
§ 2. The Classical Statistical Mechanics
§ 3. The Quantum Statistical Mechanics
§ 4. Statistical Mechanics and Thermodynamics
§ 5. Points of View and Methods of Presentation

II. THE ELEMENTS OF CLASSICAL MECHANICS
§ 6. Introduction
§ 7. State of a System. Generalized Coordinates and Velocities
§ 8. Hamilton's Principle and the Lagrangian Function
§ 9. The Equations of Motion in the Lagrangian Form
§ 10. Generalized Momenta, the Hamiltonian Function, and the Canonical Equations of Motion
§ 11. The Change in Quantities with Time. Poisson Brackets
§ 12. The Integral of Energy and the Interpretation of the Hamiltonian Function
§ 13. The Integrals of Linear and of Angular Momentum
§ 14. Canonical Transformations
§ 15. Integration of the Equations of Motion by Transformation to Cyclic Coordinates. Hamilton's Characteristic Function
§ 16. Integration of the Equations of Motion by Transformation to Constant Coordinates and Momenta. Hamilton's Principal Function

III. STATISTICAL ENSEMBLES IN THE CLASSICAL MECHANICS
§ 17. Ensemble and Phase Space
§ 18. Density of Distribution in the Phase Space. Averages for the Ensemble
§ 19. Liouville's Theorem for the Change in Density with Time
§ 20. Invariance of Density and of Extension to Canonical Transformations
§ 21. Conditions for Statistical Equilibrium
§ 22. The Uniform, Microcanonical, and Canonical Ensembles
    (a) The Uniform Ensemble
    (b) The Microcanonical Ensemble
    (c) The Canonical Ensemble
§ 23. The Fundamental Hypothesis of Equal a priori Probabilities in the Phase Space
§ 24. System of Interest and Representative Ensemble
§ 25. Validity of Statistical Mechanics

IV. THE MAXWELL-BOLTZMANN DISTRIBUTION LAW
§ 26. The Microcanonical Ensemble as Representing a System in Equilibrium
§ 27. Specification of Condition for a System of Many Similar Molecules
§ 28. The Probabilities for Different Conditions of the System
§ 29. Condition of Maximum Probability. Maxwell-Boltzmann Distribution Law



§ 30. Maxwell-Boltzmann Distribution for Molecules of More than a Single Kind
§ 31. Re-expression of Maxwell-Boltzmann Law in Differential Form
§ 32. Evaluation of Constants in the Maxwell-Boltzmann Distribution Law
    (a) Value of Constant α (or C)
    (b) Introduction of the Idea of a Perfect Gas Thermometer
    (c) Value of Constant β
§ 33. Useful Forms of Expression for the Distribution Law
§ 34. Mean Values Obtained from the Distribution Law
§ 35. The General Principle of Equipartition

V. COLLISIONS AS A MECHANISM OF CHANGE WITH TIME
§ 36. Introduction
§ 37. The Principles of Dynamical Reversibility and Reflectability
    (a) The Principle of Dynamical Reversibility
    (b) The Principle of Dynamical Reflectability
§ 38. Molecular States
    (a) Specification of Molecular States Appropriate for the Consideration of Collisions
    (b) Classification of Molecular States
§ 39. Molecular Constellations
§ 40. Molecular Collisions
§ 41. The Closed Cycle of Corresponding Collisions
§ 42. The Closed Cycle of Two Members in the Case of Spherical Molecules
§ 43. Application of Conservation Laws to Collisions
§ 44. Application of Liouville's Theorem to a Collision
§ 45. The Probability Coefficients for Collisions
§ 46. Concluding Remarks on Molecular Collisions

VI. BOLTZMANN'S H-THEOREM
§ 47. Definition of the Quantity H
§ 48. Derivation of the H-theorem
    (a) Rate of Change of H with Time
    (b) Effect of Collisions on Change of H with Time
    (c) Effect of Other Processes on Change of H with Time
§ 49. Discussion of the H-theorem
    (a) Statistical Character of the H-theorem
    (b) Observations on the Continued Decrease of H with Time
    (c) H-theorem and the Principle of Dynamical Reversibility
    (d) H-theorem and the Occurrence of Continued Fluctuations
§ 50. H-theorem and the Condition of Equilibrium
    (a) Maxwell-Boltzmann Distribution when H is a Minimum
    (b) Steady Condition when H is a Minimum
    (c) Detailed Balance when H is a Minimum
§ 51. The Generalized H-theorem
    (a) Definition of the Quantity $\bar{\bar{H}}$ for an Ensemble
    (b) Two Necessary Lemmas
    (c) Change in $\bar{\bar{H}}$ with Time
    (d) Relation between the Two Forms of H-theorem
    (e) Concluding Remarks on the Generalized H-theorem



Part Two. THE QUANTUM STATISTICAL MECHANICS

VII. THE ELEMENTS OF QUANTUM MECHANICS

A. Historical Remarks
§ 52. The Necessity for Modifying Classical Ideas
    (a) Discrete Energy Levels
    (b) Wave-particle Duality
    (c) Uncertainty, Complementarity, and Indetermination
    (d) The Correspondence Principle
    (e) Plan of Treatment

B. The Postulates
§ 53. The Existence of Probability Densities and Amplitudes
§ 54. The Interrelation of Probability Amplitudes
§ 55. The Operators Corresponding to Observable Quantities and their Use in Calculating Expectation Values
    (a) Preliminary Discussion
    (b) Operator Manipulation
    (c) Linear Operators
    (d) Hermitian Operators
    (e) The Operators q and p
    (f) The Operators Corresponding to Observable Quantities in General
    (g) The Calculation of Expectation Values in General
§ 56. The Schroedinger Equation for Change in State with Time
    (a) Postulated Form of Schroedinger Equation
    (b) Some Specific Examples of the Schroedinger Equation
    (c) Transformation of Schroedinger Equation from Coordinate to Momentum Language
§ 57. Summary of Postulatory Basis

C. Theorems Illustrating the Nature of Quantum Mechanics
§ 58. Probability Density and Probability Current
    (a) The Conservation of Total Probability
    (b) The Concept of Probability Current
§ 59. The Principle of Superposition
§ 60. Energy Levels for Systems in Steady States. Eigenvalues and Eigenfunctions
§ 61. Wave-particle Duality. De Broglie Waves for Free Particles
§ 62. The Heisenberg Uncertainty Relation
    (a) Case of a Free Particle. Wave Packets
    (b) General Treatment of Uncertainty Relations
§ 63. Correspondence between Classical and Quantum Mechanical Results
    (a) Change in Expectation Values with Time
    (b) The Analogue of the Hamiltonian Equations of Motion
    (c) The Conservation of Energy in Quantum Mechanics
    (d) The Conservation of Momentum in Quantum Mechanics
    (e) Approach of Quantum Mechanical Behaviour to the Classical Limit


D. Further Development of Quantum Mechanical Methods. Transformation Theory
§ 64. Characteristic States. Eigenvalues and Eigenfunctions in General
    (a) Equation Determining a Characteristic State
    (b) Eigenvalues and Eigenfunctions Corresponding to Characteristic States
    (c) Properties of Eigenvalues and Eigenfunctions
    (d) States Characteristic of More Than One Observable
    (e) Eigenvalues and Eigenfunctions for the Coordinates and Momenta
§ 65. Expansions in Terms of Eigenfunctions
§ 66. Expansion of the Probability Amplitude ψ(q, t)
    (a) Expansion at a Given Time of Interest
    (b) Expansion as a Function of the Time
    (c) Special Case of Expansion in Energy Eigenfunctions
§ 67. Transformation Theory
    (a) Probability Amplitudes in General
    (b) Operators in General
    (c) The Schroedinger Equation in General
    (d) The Hermitian Matrices Corresponding to Observable Quantities
    (e) Unitary Transformations between Different Quantum Mechanical Representations
    (f) Concluding Remarks on the General Quantum Mechanical Language Provided by the Transformation Theory
§ 68. The Method of Variation of Constants
    (a) Derivation of the Differential Equations
    (b) Approximate Integration for a Special Case

VIII. SOME SIMPLE APPLICATIONS OF QUANTUM MECHANICS
§ 69. Simple One-dimensional Solutions
    (a) Solutions in Regions of Constant Potential
    (b) Approximate Solutions in Regions of Varying Potential
§ 70. Particle in Free Space
§ 71. Particle in a Container
    (a) The Energy Eigenvalues and Eigenfunctions
    (b) The Number of Eigensolutions in a Given Range of Energy
§ 72. Particle in a Hooke's Law Field of Force
§ 73. Particle in a Central Field of Force
    (a) Operators for the Components of Angular Momentum
    (b) The Eigenfunctions and Eigenvalues Corresponding to Angular Momentum
    (c) Steady States in a Central Field of Force
§ 74. Two Interacting Particles
    (a) Separation into External and Internal Equations
    (b) Separation of Internal Equation in Case of Central Forces
    (c) Solutions of the Separate Equations
    (d) Indicated Nature of Applications
§ 75. Particles with Spin
    (a) The Spin Variables and Operators
    (b) Applications Involving Spin



§ 76. Systems of Two or More Like Particles
    (a) Symmetric and Antisymmetric Solutions for Pairs of Like Particles
    (b) Properties of the Symmetric and Antisymmetric Solutions
    (c) Treatment of More Than Two Like Particles
    (d) Further Properties of Symmetric and Antisymmetric Solutions. Case of Small Interaction between Particles. Pauli Exclusion Principle
    (e) Enumeration of Eigensolutions

IX. STATISTICAL ENSEMBLES IN THE QUANTUM MECHANICS
§ 77. Introduction of Statistical Methods in the Classical and in the Quantum Mechanics
§ 78. The Density Matrix in Quantum Statistical Mechanics
§ 79. Transformation of the Density Matrix from One Quantum Mechanical Language to Another
§ 80. Density Matrix Corresponding to a Pure State
§ 81. The Analogue of Liouville's Theorem in Quantum Statistical Mechanics
    (a) Time Dependence of Density Matrix in Language Provided by True Energy States
    (b) Time Dependence of Density Matrix in Language Provided by Unperturbed Energy States
    (c) Time Dependence of Density Matrix in General Languages
§ 82. Conditions for Statistical Equilibrium
    (a) Density a Constant
    (b) Density a Function of a Constant of the Motion
§ 83. The Uniform, Microcanonical, and Canonical Ensembles in the Quantum Mechanics
    (a) The Uniform Ensemble
    (b) The Microcanonical Ensemble
    (c) The Canonical Ensemble
§ 84. The Fundamental Hypothesis of Equal a priori Probabilities and Random a priori Phases for the Quantum Mechanical States of a System
§ 85. Validity of Statistical Quantum Mechanics

X. THE MAXWELL-BOLTZMANN, EINSTEIN-BOSE, AND FERMI-DIRAC DISTRIBUTIONS
§ 86. The Microcanonical Ensemble as Representing a System in Equilibrium
§ 87. Specification of Condition for a System Composed of Weakly Interacting Elements
    (a) Relation of Eigensolutions for System to Eigensolutions for its Component Elements
    (b) Method of Specifying Different Conditions
    (c) Number of Eigenstates Corresponding to a Specified Condition of the System
§ 88. The Probabilities for Different Conditions of the System
§ 89. Condition of Maximum Probability. The Three Distribution Laws


§ 90. Distribution in Systems Containing Constituent Elements of More Than One Kind
§ 91. Evaluation of Constants in the Distribution Laws
    (a) Value of Constant α
    (b) Value of Constant β
§ 92. Maxwell-Boltzmann Systems
    (a) Mean Energy of Oscillators of Frequency ν
    (b) Application to the Specific Heat of Solids
    (c) Application to Radiation
§ 93. Einstein-Bose Systems
    (a) Application to Radiation
    (b) Useful Integrals in the Einstein-Bose Case
    (c) Values of Parameter α, Energy E, and Pressure p
    (d) Case of Slight Degeneration
§ 94. Fermi-Dirac Systems
    (a) Useful Integrals in the Fermi-Dirac Case
    (b) Values of Parameter α, Energy E, and Pressure p
    (c) Case of Slight Degeneration
    (d) Case of Extreme Degeneration
    (e) Remarks on Applications to Conduction Electrons

XI. THE CHANGE IN QUANTUM MECHANICAL SYSTEMS WITH TIME
§ 95. Dynamical Reversibility in the Quantum Mechanics
§ 96. Integration of Schroedinger Equation for Changes with Time in an Isolated System
    (a) Introduction
    (b) Expansion of State in Terms of Unperturbed Energy Eigenstates and Integration by the Method of Variation of Constants
    (c) Expansion of State in Terms of General Eigenfunctions and Integration as a Taylor's Series in the Time
    (d) Change with Time Regarded as a Unitary Transformation
    (e) Application to the Calculation of Probabilities as a Function of Time
§ 97. Integration of Schroedinger Equation when an External Parameter is Varied
    (a) Probability Amplitudes for Energy States that Depend on an External Parameter
    (b) Gradual Change in Parameter
    (c) Abrupt Change in Parameter
§ 98. Observation and Specification of State in Studying the Change in Quantum Mechanical Systems with Time
    (a) Complementarity Restrictions on Observations
    (b) Approximate Specifications of State in the Quantum Mechanics
    (c) Approximate Specification of Unperturbed Energy Eigenstates
    (d) Approximate Specification of Eigenstates in General
    (e) Remarks on the Assignment of Equal Probabilities and Random Phases
§ 99. Time-Proportional Transitions
    (a) Transition from a Discrete State to a Continuous Spectrum of ...


    (b) Transition from the Continuous Spectrum back to the Discrete State
    (c) Transition from One Group of Continuous States to Another
    (d) General Formulation of Transition Probabilities
§ 100. The Probabilities for Transition by Collision in Fermi-Dirac and Einstein-Bose Gases
    (a) Perturbation Matrix for the Interaction of Fermi-Dirac Particles
    (b) Perturbation Matrix for the Interaction of Einstein-Bose Particles
    (c) Time-proportional Collision Probabilities
§ 101. General Treatment of Changes in Ensembles with Time

XII. THE QUANTUM MECHANICAL H-THEOREM

A. Derivation of Theorem
§ 102. Definition of H for a Gas
§ 103. Change of H with Time as a Result of Collisions
§ 104. Definition of $\bar{\bar{H}}$ for a Representative Ensemble of Systems
    (a) Fine-grained and Coarse-grained Probabilities in the Quantum Mechanics
    (b) General Expression for $\bar{\bar{H}}$
    (c) Relation between H and $\bar{\bar{H}}$
§ 105. Change of H with Time by the Method of Transition Probabilities
§ 106. Change of H with Time from the Exact Integration of the Schroedinger Equation
    (a) The Representative Ensemble
    (b) Needed Results of the Exact Integration of Schroedinger's Equation
    (c) The Klein Relation as a Necessary Lemma
    (d) Derivation of the Generalized H-theorem
    (e) Further Discussion of the Generalized H-theorem
§ 107. Application of H-theorem to Interacting Systems

B. Relation of H-theorem to Behaviour at Equilibrium
§ 108. Relation to Previous Studies of Equilibrium
§ 109. The Long Time Behaviour of Ensembles Representing Perfectly Isolated Systems
§ 110. The Microcanonical Ensemble as Representing Equilibrium for a Perfectly Isolated System
§ 111. The Long Time Behaviour of Ensembles Representing Systems in Contact with their Surroundings
    (a) Probabilities for States of the Combined System
    (b) Number of States of Surroundings as a Function of Energy
    (c) Probabilities for States of the System Proper
    (d) The Concepts of Thermal Equalization, Essential Isolation, and Essentially Adiabatic Process
§ 112. The Canonical Ensemble as Representing Equilibrium for a System in a Heat Bath or in Essential Isolation
    (a) The Canonical Distribution
    (b) Justification for the Canonical Ensemble


C. Specific Examples of Equilibrium
§ 113. Equilibrium in Maxwell-Boltzmann Systems
§ 114. Equilibrium in Einstein-Bose and Fermi-Dirac Gases
    (a) Derivation of the Distribution Laws
    (b) Investigation of Approximation
    (c) Further Discussion of the Einstein-Bose and Fermi-Dirac Distribution Laws
§ 115. Equilibrium in General in Physical-Chemical Systems
§ 116. The Principle of Detailed Balance in the Quantum Mechanics

Part Three. STATISTICAL MECHANICS AND THERMODYNAMICS

XIII. STATISTICAL EXPLANATION OF THE PRINCIPLES OF THERMODYNAMICS
§ 117. Introduction
    (a) Thermodynamic System and Representative Ensemble
    (b) The Nature of Thermodynamic Variables
    (c) Energy, Work, and Heat
§ 118. The Energy Principle for Ensembles
§ 119. The Analogue of the First Law of Thermodynamics
§ 120. The Canonical Ensemble as Representing Thermodynamic Equilibrium
§ 121. Relation Connecting the Values of $\bar{\bar{H}}$ in Neighbouring Canonical Ensembles
§ 122. Statistical-Mechanical Analogues of Entropy, Temperature, and Free Energy
§ 123. Effect on $\bar{\bar{H}}$ of Leaving a System in Essential Isolation
§ 124. Effect on $\bar{\bar{H}}$ of Adiabatic Changes in External Coordinates
    (a) Reversible Adiabatic Change in an External Coordinate
    (b) Irreversible Adiabatic Change in an External Coordinate
§ 125. Effect on $\bar{\bar{H}}$ of Interaction in General
§ 126. Lemma on ...
§ 127. The Direction of Thermal Flow as Dependent on θ
§ 128. Effects of Various Kinds of Thermal Process
    (a) Effect on $\bar{\bar{H}}$ of Thermal Interaction in General
    (b) Effect on $\bar{\bar{H}}$ of Thermal Transfer from a Canonical ...
    (c) Thermal Equilibrium as a Result of Successive Contacts
    (d) The Limiting Case of Reversible Thermal Transfer
§ 129. Carnot Cycle of Processes
§ 130. The Analogue of the Second Law of Thermodynamics
§ 131. Remarks on the Statistical Explanation of Thermodynamics

XIV. FURTHER APPLICATIONS TO THERMODYNAMICS
§ 132. Thermodynamic Quantities in Terms of the Free Energy
§ 133. Thermodynamic Quantities in Terms of the Sum-over-states
§ 134. Sum-over-states as Dependent on Molecular States
    (a) Case of Permanently Distinguishable Elements
    (b) Case of Einstein-Bose and Fermi-Dirac Gases



§ 135. Perfect Monatomic Gas
    (a) Sum-over-states
    (b) Thermodynamic Quantities
    (c) Evaluation of Constant k
§ 136. Perfect Gases Composed of More Complicated Molecules
    (a) Sum-over-states
    (b) Thermodynamic Quantities
    (c) Energy and Entropy of Actual Monatomic and Diatomic Gases
§ 137. Crystals Composed of a Single Substance
    (a) The Modes of Vibration of a Crystal
    (b) Application of Quantum Mechanics
    (c) Sum-over-states for the Crystal
    (d) Thermodynamic Properties of the Crystal
    (e) Remarks on the Entropy of Crystals
§ 138. Mixtures of Substances
    (a) Gaseous Mixtures
    (b) Mixed Crystals
    (c) Liquid Mixtures
    (d) On the Definition of the Ideal Solution
§ 139. Vapour Pressures and Chemical Equilibria
    (a) The Thermodynamic Potentials of Crystals and of Gases
    (b) Vapour Pressures of Crystals
    (c) Chemical Equilibria in Gases
    (d) On the Status of the so-called Third Law of Thermodynamics
§ 140. Equilibrium between Connected Systems
    (a) Thermodynamic Relations
    (b) The Grand Canonical Ensemble
    (c) Correlation of Thermodynamic and Statistical-Mechanical Quantities
    (d) Conditions for Equilibrium when Grand Canonical Ensembles are Combined
    (e) Explanation of the Gibbs Paradox
§ 141. Fluctuations at Thermodynamic Equilibrium
    (a) Fluctuations in the Case of Maxwell-Boltzmann, Einstein-Bose, and Fermi-Dirac Distributions
    (b) Fluctuations in Total Energy
    (c) Fluctuations in an External Force
    (d) Einstein's Formula for Fluctuations in a Macroscopic Variable
    (e) Fluctuations in Composition
    (f) Fluctuations in Density of a Fluid
§ 142. Conclusion

APPENDIX I. Symbols for Quantities, Operators, and Matrices
APPENDIX II. Some Useful Formulae
SUBJECT INDEX
NAME INDEX




I 

INTRODUCTION 


1. The nature of statistical mechanics
It is the purpose of this book to expound the fundamental principles 
of statistical mechanics. Only incidental discussion of the applica- 
tions of statistical methods to the problems of usual physical-chemical 
interest will be given. It is hoped, however, that the exposition of 
underlying assumptions and general methods will be sufficient to give 
a real insight into the physical nature and mathematical apparatus of 
statistical mechanics. 

The science of statistical mechanics has the special function of pro- 
viding reasonable methods for treating the behaviour of mechanical 
systems under circumstances such that our knowledge of the condition 
of the system is less than the maximal knowledge which would be 
theoretically possible. The principles of ordinary mechanics may be 
regarded as allowing us to make precise predictions as to the future state 
of a mechanical system from a precise knowledge of its initial state.†
On the other hand, the principles of statistical mechanics are to be 
regarded as permitting us to make reasonable predictions as to the 
future condition of a system, which may be expected to hold on the 
average, starting from an incomplete knowledge of its initial state. 

Since our actual contacts with the physical world are such that we 
never do have the maximal knowledge of systems regarded as theoretic- 
ally allowable, the idea of the precise state of a system is in any case 
an abstract limiting concept. Hence the methods of ordinary mechanics 
really apply to somewhat highly idealized situations, and the methods 
of statistical mechanics provide a significant supplement in the direction 
of decreased abstraction and closer correspondence between theoretical 
methods and actual experience. Even in the case of simple systems of 
only a few degrees of freedom, where our lack of maximal knowledge
is not due to difficulties arising from the complexity of the system, the
methods of statistical mechanics may be applied to a system whose
initial state is not completely specified.‡

† This is true both for the classical mechanics and for the quantum mechanics. In
the latter case, however, a knowledge of the precise state of the system is not sufficient
to determine precise values for all kinds of quantities. See Chapter VII.

‡ In the case of the classical mechanics, the possibility of applying statistical methods
to a system of a small number of degrees of freedom was emphasized by Gibbs himself
as well as by E. B. Wilson, Annals of Math. 10, 129, 149 (1909). In the case of the
quantum mechanics, the description of the change in time of atomic systems in terms


In its historical development, however, the science of statistical
mechanics was specially devised for the treatment of complicated
systems, composed of such an enormous number of individual molecules
that it would be too difficult to try to calculate the precise behaviour
of the system as a function of time by the methods of ordinary
mechanics. For example, in the case of a gas, consisting, say, of a large
number of simple classical particles, even if we were given at some
initial time the positions and velocities of all the particles so that we
could foresee the collisions that were about to take place, it is evident
that we should be quickly lost in the complexities of our computations
if we tried to follow the results of such collisions through any extended
length of time. Nevertheless, a system such as a gas composed of many
molecules is actually found to exhibit perfectly definite regularities in
its behaviour, which we feel must be ultimately traceable to the laws
of mechanics even though the detailed application of these laws defies
our powers. For the treatment of such regularities in the behaviour
of complicated systems of many degrees of freedom, the methods of
statistical mechanics are adequate and especially appropriate.

The general nature of the statistical mechanical procedure for the 
treatment of complicated systems consists in abandoning the attempt 
to follow the precise changes in state that would take place in a 
particular system, and in studying instead the behaviour of a collection or 
ensemble of systems of similar structure to the system of actual interest, 
distributed over a range of different precise states. From a knowledge 
of the average behaviour of the systems in a representative ensemble, 
appropriately chosen so as to correspond to the partial knowledge that 
we do have as to the initial state of the system of interest, we can then 
make predictions as to what may be expected on the average for the 
particular system which concerns us. 

The averages that we take for the properties of the systems in a 
representative ensemble may be mean values, or most probable values, 
or other average values as seems of interest.† The different kinds of 
average tend to coincide in the case of systems of large numbers of 
degrees of freedom. Furthermore, as we go to large numbers of degrees 

of the language of transition probabilities often actually involves statistical mechanical 
methods even in the case of a system having only a few degrees of freedom, as will be 
seen in § 99. 

† The term average is used throughout the book, in the usual sense of statisticians, 
as signifying a single number which is chosen as giving a good representation of a 
collection of numbers. The averages that will usually interest us will be the arithmetical 
mean and the mode or most probable. The mean square and the root mean square, 
however, will also occasionally appear. 



of freedom, the average behaviour in an appropriately chosen repre- 
sentative ensemble is actually found to give a completely satisfactory 
account of the regularities of behaviour that are empirically known for 
individual systems. 
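
The tendency of the different kinds of average to coincide can be sketched numerically. The example below is only an illustration under an assumption that is not made in the text: the energy of a system with f quadratic degrees of freedom is taken to follow a gamma distribution with shape f/2, as it would for an ideal system in a canonical ensemble. The function names and the scale parameter `theta` are hypothetical.

```python
import math


def gamma_averages(f, theta=1.0):
    """Mean, mode and root mean square of a gamma distribution.

    Assumption (not from the text): the shape is f/2 and the scale is
    theta, with f > 2 so that the mode lies at a positive value.
    """
    k = f / 2.0
    mean = k * theta                        # arithmetical mean
    mode = (k - 1.0) * theta                # most probable value
    rms = math.sqrt(k * (k + 1.0)) * theta  # root mean square
    return mean, mode, rms


def relative_spread(f):
    """Gap between the largest and smallest average, relative to the mean."""
    mean, mode, rms = gamma_averages(f)
    return (rms - mode) / mean


for f in (3, 30, 3000, 3_000_000):
    mean, mode, rms = gamma_averages(f)
    # The ratios mode/mean and rms/mean both approach 1 as f grows.
    print(f, mode / mean, rms / mean)
```

For a few degrees of freedom the three averages differ appreciably; for very many they become practically indistinguishable, which is the behaviour described above.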

It may seem surprising that the methods of statistical mechanics 
should be so simple and effective as they actually prove to be. The 
substitution of a whole ensemble of systems of similar structure, in 
place of a single system that was too complicated to treat by ordinary 
mechanical methods, would not of itself lead to increased simplicity of 
treatment. The simplification is actually introduced by the circum- 
stance that the average behaviour of a suitably chosen collection of 
systems may be much easier to treat than the precise behaviour of a 
single complicated system. 

Similar situations are typical of biological phenomena. A given man 
of interest, 40 years old, is altogether too complicated a system, sur- 
rounded by too complicated an environment, to permit a precise com- 
putation of the date of his death. Nevertheless, we have the possibility 
of preparing very reliable tables which will give the average number 
of men, of any selected age, in any geographical region, who can be 
expected to die within a given length of time. Both in the physical 
and biological sciences an appeal to statistical methods may be taken 
when the complexity of the situation of interest demands it. 

The introduction of statistical methods in order to treat the behaviour 
of complicated systems may be carried out in two different ways which 
may be described by the terms deductive and inductive. A deductive 
introduction of statistical methods may be obtained by making a 
preliminary study of the nature of the system of interest and then 
introducing what seem to be reasonable postulates which will permit the 
calculation of average behaviour. An inductive introduction of 
statistical methods can be obtained by studying the actual behaviour in many 
examples of the situation of interest, and then codifying the average 
results obtained in order to make them available for use in further 
predictions. 

The two modes of procedure may be illustrated in connexion with 
the results to be expected on flipping a coin. For a deductive approach 
we could make a preliminary study of the nature of the coin, thus 
assuring ourselves, for example, that the coin was not loaded, and then 
introduce the postulate of equal a priori probabilities for coming up 
heads or tails; this would then permit us to calculate the average 
number of heads and average fluctuations therefrom that might be 



expected in a thousand throws. For an inductive approach we could 
make empirical determinations of the average results obtained in groups 
of trials, and then use these for predictive purposes. Both methods can, 
of course, be applied to the same problem. As is familiar in other 
branches of scientific theory, the validity of the postulates introduced 
in a deductive exposition can be subjected to some measure of 
observational check. 
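
The two approaches to the coin can be put side by side in a short sketch. It is illustrative only: the function names are hypothetical, and in the inductive half a pseudo-random generator stands in for the real coin, which is itself an assumption. The deductive half uses the postulate of equal a priori probabilities together with the standard binomial results for the mean and the standard deviation.

```python
import math
import random


def deductive_prediction(n_throws, p_heads=0.5):
    """Average number of heads and its fluctuation from the postulate.

    Uses the binomial mean n*p and standard deviation sqrt(n*p*(1-p)).
    """
    mean = n_throws * p_heads
    std = math.sqrt(n_throws * p_heads * (1.0 - p_heads))
    return mean, std


def inductive_estimate(n_throws, n_groups, rng):
    """Empirical mean and sample standard deviation over groups of trials.

    Assumption: rng.random() < 0.5 plays the part of one observed throw.
    """
    counts = [
        sum(rng.random() < 0.5 for _ in range(n_throws))
        for _ in range(n_groups)
    ]
    mean = sum(counts) / n_groups
    variance = sum((c - mean) ** 2 for c in counts) / (n_groups - 1)
    return mean, math.sqrt(variance)


# Deductive: 500 heads on the average, with a fluctuation of about 15.8.
print(deductive_prediction(1000))

# Inductive: codify the results of 200 groups of 1000 throws each.
print(inductive_estimate(1000, 200, random.Random(12345)))
```

Comparing the two printed pairs is a small instance of the observational check on a deductive postulate mentioned above.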

The actual development of statistical mechanics has taken place 
along deductive lines. On the basis of the laws of ordinary mechanics, 
reasonable postulates are introduced which are sufficient to permit 
computations of the average results which may be expected on actual 
experimentation. For the development of statistical mechanics in its 
final quantum form it has been found necessary to introduce a pair 
of assumptions, which may be designated together as the hypothesis of 
equal a priori probabilities and of random a priori phases for the quantum 
states of a mechanical system. As is so often the case in theoretical 
developments, this hypothesis is not itself suitable for direct empirical 
test. It has, however, the indirect check that is provided by the 
extensive and almost astounding agreement, which is actually found 
between the deductive results obtained with its help and the 
observational results obtained on experiment. 

2. The classical statistical mechanics 

In the first part of this book, Chapters II to VI, we shall give a 
brief exposition of the classical statistical mechanics. This will be of 
advantage because of the desirability of first grasping the concepts and 
methods of statistical mechanics in their historically original and simpler 
form, where the new kinds of principle involved can be most clearly 
seen. It will also be of advantage since the results of the classical 
development are still valid, in accordance with the correspondence 
principle, under limiting conditions where quantum deviations from 
classical findings can be neglected. There are, moreover, many problems 
where such limiting conditions prevail, and the classical point of view 
and treatment prove most satisfactory. 

In Chapter II, ‘The Elements of Classical Mechanics’, we give a brief 
treatment of the classical mechanics itself, to the extent necessary to 
provide a basis for the development of the classical statistical mechanics, 
and also a basis for understanding the later change from the classical 
to the quantum mechanical point of view. We choose Hamilton’s 
principle as the primary postulate for the development, and then obtain 



the equations of motion in the Lagrangian and canonical Hamiltonian 
forms. After treating the integrals of energy and momentum, we give 
a brief discussion of canonical transformations and the general theory 
of integration. 

In Chapter III, ‘Statistical Ensembles in the Classical Mechanics’, 
we turn our attention to statistical methods by giving general con- 
sideration to the properties and behaviour of collections or ensembles 
of systems, each having a structure similar to that of some actual system 
of interest, and each independently obeying the laws of classical 
mechanics, but distributed over a range of different states. Early in 
the chapter we derive the fundamental theorem of Liouville which 
controls the temporal behaviour of ensembles. This permits us to discuss 
the conditions for statistical equilibrium, and to show the reason for 
introducing as the basic assumption of classical statistics the hypothesis 
of equal a priori probabilities for the different classical states defined by 
equal volumes in the phase space corresponding to the system of 
interest. We take this principle as a reasonable postulate to introduce, 
but one to be ultimately justified, however, only by the correspondence 
between deduced and observational results. We conclude the chapter, 
nevertheless, with a discussion of the historical attempt to obtain a 
more direct justification of statistical methods with the help of the 
so-called ergodic hypothesis. 

In Chapter IV, ‘The Maxwell-Boltzmann Distribution Law’, we 
apply the methods of classical statistics to systems in equilibrium. To 
do this we regard a system in equilibrium with a specified energy as 
appropriately represented by an ensemble of systems, each having sub- 
stantially that energy, distributed in accordance with the principle of 
equal a priori probabilities, the whole in statistical equilibrium. As a 
convenient average, we then consider the most probable condition in 
such an ensemble, and are led in the case of systems composed of weakly 
interacting molecules to the Maxwell-Boltzmann distribution for the 
expected numbers of molecules in their different individual states. This 
is one of the most important consequences of the classical statistics, and 
the chapter will include some discussion of its significance and use. 

In Chapter V, ‘Collisions as a Mechanism of Change with Time’, we 
prepare the way for our later deduction of Boltzmann’s H-theorem as 
to the temporal behaviour of complicated systems, by studying the 
nature of molecular collisions as a primary agent leading actual 
physical-chemical systems to change with time. We begin with a discussion of 
the general principle of dynamical reversibility in time, not only because 



of its special bearing on the possibility of reverse collisions, but also 
because of its general bearing on the problem of the actual irreversibility 
found in the behaviour of physical systems when they are looked at 
from a gross, macroscopic point of view. We next classify the different 
kinds of molecular states, constellations, and collisions that can occur, 
and discuss the arrangement of corresponding collisions in a cyclical 
series. The consequences of Liouville’s theorem when applied to a single 
collision are now investigated, and, making use of statistical 
considerations, the relations between the probability coefficients for any specified 
collision and the collisions reverse and corresponding thereto are then 
obtained. 

In Chapter VI, ‘Boltzmann’s H-theorem’, we are ready to complete 
the discussion that is needed to give a good insight into the classical 
statistics. With the help of the relations between the probability 
coefficients for collisions, obtained in the preceding chapter, we derive 
Boltzmann’s famous H-theorem for the probable decrease in H with time, 
and the corresponding approach of systems towards a condition of 
equilibrium when they are left undisturbed. In the light of this 
H-theorem, discussion is then given to the problem of the actual 
irreversibility in the behaviour of physical-chemical systems when looked 
at from a gross macroscopic point of view, and to the problem of the 
continued fluctuations that must be expected around the ultimate 
condition of equilibrium. We then use the H-theorem to give a new 
discussion of the Maxwell-Boltzmann distribution as the equilibrium 
condition which will be reached in course of time, and give a discussion 
of the useful principle of detailed balance which holds under equilibrium 
conditions. We conclude the chapter with a discussion of the more 
general formulation and more powerful treatment of the problem of 
approach towards equilibrium which was discovered by Gibbs. 

3. The quantum statistical mechanics 

In the second part of the book, Chapters VII to XII, we undertake 
the exposition of quantum statistical mechanics. This part of the 
book is much longer than the first part dealing with the classical 
statistics. In the first place, greater detail is needed since the ideas of 
quantum mechanics, although intrinsically simple, must be expressed 
in less familiar and more intricate mathematical language than classical 
ones. In the second place, the introduction of quantum mechanics 
leads, as is well known, in addition to the Maxwell-Boltzmann type of 
statistical result, to the further types of possible result associated with 


the names Einstein-Bose and Fermi-Dirac. Furthermore, the treatment 
of quantum statistics is appropriately made more complete than that 
of classical statistics, since quantum mechanics is the fundamental 
discipline which includes all the results of classical mechanics as limiting 
cases, valid when specific quantum effects can be neglected. 

In Chapter VII, ‘The Elements of Quantum Mechanics’, we give 
a fairly complete treatment of the fundamental principles of 
non-relativistic quantum mechanics. After discussing the necessity for 
modifying classical ideas, a rather full statement of the postulatory basis 
for quantum mechanics is presented, with the Schroedinger equation for 
the change in probability amplitudes with time playing a fundamental 
role. This is followed by the derivation of theorems which illustrate 
the general nature of quantum mechanical considerations. We then 
conclude the chapter by discussing the further development of quantum 
mechanical methods, including the ideas of the transformation theory. 

In Chapter VIII, ‘Some Simple Applications of Quantum Mechanics’, 
we continue the consideration of quantum mechanics itself in order to 
obtain specific results which will either be advantageous in throwing 
light on the nature of quantum theory or be necessary for our later 
statistical considerations. The applications include treatments of simple 
particles in various fields of force, of interacting particles, of particles 
with spin, and of systems containing particles having intrinsically 
similar properties. This last application is of special importance when 
we use statistical mechanics to treat the behaviour of actual 
physical-chemical systems containing many similar molecules. 

In Chapter IX, ‘Statistical Ensembles in the Quantum Mechanics’, 
we can now turn our attention to statistical matters by giving general 
consideration to the properties and behaviour of ensembles of systems, 
each having a structure similar to that of some actual system of interest, 
and each independently obeying the laws of quantum mechanics, but 
distributed over a range of different quantum mechanical states. After 
discussing the apparatus involving the density matrix which is needed 
for the treatment of such quantum mechanical ensembles, the quantum 
analogue of Liouville's theorem is derived. This permits us to discuss 
the conditions for statistical equilibrium, and to show the reasons for 
introducing as the basic assumptions of quantum statistics the hypo- 
thesis of equal a priori probabilities and of random a priori phases for the 
quantum mechanical states of a system. At the correspondence prin- 
ciple limit these assumptions taken together agree with the classical 
hypothesis of equal a priori probabilities for classically defined states. 



We regard these assumptions as reasonable postulates to introduce, 
but to be ultimately justified, however, by the agreement between 
deduced and empirical results. We conclude the chapter with a general 
discussion of the validity of statistical quantum mechanics together 
with remarks on the quantum mechanical status of the ergodic 
hypothesis. 

In Chapter X, ‘The Maxwell-Boltzmann, Einstein-Bose, and 
Fermi-Dirac Distributions’, we apply the methods of quantum statistics to 
systems in equilibrium. To do this we regard a system in equilibrium 
with an approximately known value of its energy, as appropriately 
represented by an ensemble of systems having that approximate energy, 
distributed in accordance with the hypothesis of equal a priori 
probabilities and random a priori phases, and itself in statistical equilibrium. 
We then consider the most probable condition in such an ensemble, and 
are led in the case of systems composed of many similar weakly 
interacting elements either to the Maxwell-Boltzmann, Einstein-Bose, or 
Fermi-Dirac distributions, according to the type of symmetry 
restrictions (none, symmetric, or antisymmetric) to be satisfied by the 
solutions of Schroedinger’s equation in the cases of the different kinds 
of particles or other elements that occur in nature. These results are of 
great practical importance, and we give some indication of their 
application to gases, crystals, conduction electrons, and radiation. 

In Chapter XI, ‘The Change in Quantum Mechanical Systems with 
Time’, we now prepare the way for our later derivation of the quantum 
mechanical form of Boltzmann’s H-theorem by studying the nature of 
the changes with time that take place in quantum mechanical systems. 
On account of the greater power of quantum as compared with 
classical methods, these considerations are fortunately much more 
general than our analogous classical considerations, which were mainly 
concerned with molecular collision as the mechanism of change. It is 
first shown that the principle of dynamical reversibility holds also in 
the quantum mechanics in appropriate form, indicating that quantum 
theory supplies no new kind of element for understanding the actual 
irreversibility in the macroscopic behaviour of physical systems. The 
next section of the chapter is then devoted to the mathematical problem 
of the integration of Schroedinger’s equation for the change in state of 
a system with time, by the method of variation of constants and by the 
more general method of treating change with time as a unitary 
transformation. And a further section is devoted to the mathematical 
problem of treating the effect of changes with time in the external 
parameters of a system. We then turn to more physical matters by 
considering the character of the actual observations and specifications of 
state that could be appropriately made for a quantum mechanical 
system which is changing with time. We next consider the theory of 
transitions that take place in a system with a probability proportional 
to the time, and treat the special case of transitions that take place 
in a gas as a consequence of collisions. We conclude the chapter with 
a brief general treatment of changes with time in ensembles of systems. 

In Chapter XII, ‘The Quantum Mechanical H-theorem’, we complete 
our study of the essential features of the quantum statistical mechanics. 
The chapter is divided into three parts. In Part A we derive the 
quantum mechanical H-theorem first in a form similar to that of 
Boltzmann, where H is regarded as a quantity directly characterizing the 
system of interest, and then in a form similar to that of Gibbs, where S 
is introduced as a quantity directly characterizing the representative 
ensemble for the system of interest. In Part B we then discuss the 
relation of the decrease in H with time to the ultimate equilibrium 
condition for a system of interest, and show the reasons for taking a 
microcanonical ensemble as representing equilibrium for a perfectly 
isolated system, and for taking a canonical ensemble as representing 
equilibrium for a system in contact with a heat bath or in what we 
shall call essential rather than perfect isolation. In Part C we make 
use of the canonical ensemble in studying some specific examples of 
equilibrium and conclude with remarks on the quantum form of the 
principle of detailed balance. 

4. Statistical mechanics and thermodynamics 
In the third and final part of this book, Chapters XIII and XIV, 
we discuss the application of statistical mechanics to the problem of 
obtaining a mechanical explanation for the phenomena of thermo- 
dynamic behaviour. We cannot refrain from doing this, even though 
our primary purpose has been to expound rather than to apply the 
principles of statistical mechanics. The explanation of the complete 
science of thermodynamics in terms of the more abstract science of 
statistical mechanics is one of the greatest achievements of physics. 
In addition, the more fundamental character of statistical mechanical 
considerations makes it possible to supplement the ordinary principles 
of thermodynamics to an important extent. 

In Chapter XIII, ‘Statistical Explanation of the Principles of 
Thermodynamics’, we present the desired mechanical interpretation 
and explanation of the first and second laws of thermodynamics. To 
do this the thermodynamic variables ordinarily used to describe a 
thermodynamic system of interest, such as volume, pressure, energy, 
entropy, and temperature, are correlated with mechanical quantities 
belonging to a representative ensemble that corresponds in an 
appropriate manner to the system in question; and the thermodynamic 
behaviour of the system is then correlated with the mean behaviour 
in this representative ensemble. The proper statistical correlates for 
essentially mechanical quantities such as volume, pressure, and energy 
are readily obtained. The correlates for the quantities entropy and 
temperature, which thermodynamics has introduced on its own level 
of abstraction, need more discussion, however, and the greater part of 
the chapter is devoted to showing that the correlates chosen do have 
the correct properties. The ultimate outcome is the derivation of two 
statistical mechanical relations which are the same in form and 
equivalent in implications to the thermodynamic relations which express 
the first and second laws of thermodynamics. 

In Chapter XIV, ‘Further Applications to Thermodynamics’, we now 
apply the consequences of the preceding chapter to a number of 
illustrative cases. Without such considerations, the increased power of 
statistical mechanics in obtaining actual expressions for the quantities 
used by thermodynamics would not be appreciated. As the statistical 
apparatus of calculation we use the sum-over-states for the 
thermodynamic system of interest, and with its help we treat the properties 
of gases, crystals, and solutions. The problems of vapour pressure and 
chemical equilibria are also treated, and the present status of the 
so-called third law of thermodynamics is discussed. This is followed by 
a discussion of the use of grand canonical ensembles as representing 
equilibrium for ‘open’ thermodynamic systems. The chapter ends with 
a discussion of fluctuations in statistical ensembles, the limitations 
thereby imposed on thermodynamic concepts, and the observational 
detection of fluctuations in systems at thermodynamic equilibrium. 

5. Points of view and methods of presentation 

Before undertaking the programme outlined above, a few remarks 
may be made concerning the points of view that will be adopted as 
fundamental, and methods of presentation that will be selected as con- 
venient. 

It is first to be emphasized, as already indicated above, that the 
treatments given in this book, to the classical and to the quantum 



mechanics, and to the two corresponding forms of statistics, are to be 
regarded as deductive expositions based on postulates. These postulates 
will be chosen in ways which are familiar or plausible, but their ultimate 
validity will be regarded as resting on the correspondence between 
deduced results and empirical findings. It is hoped that the postulatory 
expositions of classical and of quantum mechanics that are given will 
seem compact and useful treatments without reference to their further 
employment for statistical purposes. 

In building up our deductive systems we shall not try to present 
a complete set of indefinables, definitions, and postulates, nor to achieve 
the rigour of demonstration that would be demanded for the perfect 
construction of a logical universe of discourse. We shall, however, try 
to give expositions that will provide a real insight into the physical 
nature of the situations to be treated. For the classical mechanics we 
take Hamilton’s principle, with the associated ideas as to generalized 
coordinates and velocities and the Lagrangian function, as a starting- 
point which provides an appropriate generalization of Newton’s laws 
of motion. For the quantum mechanics we take Schroedinger’s equa- 
tion including the time, with the associated ideas as to probability 
amplitudes and the Hamiltonian and other operators, as fundamental. 
For the extensions to statistical mechanics we take the assumptions of 
equal a priori probabilities for classical states, or of equal a priori 
probabilities and random a priori phases for quantum mechanical 
states, as the necessary additional hypotheses. 

In developing the quantum mechanics itself we shall only strive for 
a non-relativistic theory. This will be sufficient for illustrating all the 
essential features of the quantum statistical mechanics and for treating 
all the commoner problems to which statistical methods are applied. 
The development will be carried out making use for the most part 
of coordinate representations for the quantum mechanical states of a 
system; thus it will start from Schroedinger’s equation expressed in 
coordinate or q-language. The ideas of the transformation theory will 
be introduced later. The quantum mechanical notation used will be 
similar to that of Pauli in his Handbuch article, and our own 
development will lean heavily on his exposition.† 

As the fundamental idea involved in statistical treatments, we take 
the procedure of correlating any actual mechanical system of interest, 
in an incompletely specified state, with an appropriately chosen repre- 
sentative ensemble of such systems, distributed over a range of states, 
† Pauli, Handbuch der Physik, second edition, volume 24/1, Berlin, 1933. 



followed by the procedure of then using average values for the members 
of this ensemble as furnishing good estimates as to what we can expect 
for the actual system. In this connexion different possibilities are open 
as to what representative ensemble we select as appropriate, and as to 
what averages we take as convenient. 

In the treatment of equilibrium we shall first adopt the familiar 
method of representing a system at equilibrium by a microcanonical 
ensemble (composed of members all of which have substantially the 
same energy as the system of interest) and of then taking the most 
probable values of quantities in the ensemble as convenient averages. 
We shall use this method in the derivation of the classical 
Maxwell-Boltzmann distribution law which we give in Chapter IV, and in the 
derivations of the quantum Maxwell-Boltzmann, Einstein-Bose, and 
Fermi-Dirac distributions which we give in Chapter X. Such a method 
of procedure is theoretically quite sound provided the system of interest 
can be regarded as having a practically definite energy at equilibrium. 
This is, of course, true for perfectly isolated systems which have been 
allowed to come to equilibrium, and is nearly enough true for systems 
which have been allowed to come to thermal equilibrium with their 
surroundings, provided the number of degrees of freedom for the system 
is sufficiently large. 

At a later stage of the development we shall then adopt the 
somewhat less common method of representing a system at equilibrium by 
a canonical ensemble (composed of members distributed to correspond 
to the temperature of the system of interest) and of then taking the 
mean values of quantities in the ensemble as convenient averages. We 
shall use this method both for a new and more satisfactory treatment 
of the three molecular distribution laws mentioned above, and shall 
make it fundamental in our treatment of thermodynamic equilibrium. 
Such a method of procedure, as first appreciated by Gibbs, is 
theoretically a sound one when the system of interest can be regarded as having 
a definitely specified temperature. 

It will also be of interest in this connexion to note a further 
alternative method of treatment employed by Fowler in his extensive and 
important treatise on statistical mechanics.† There, a system in a 
condition of equilibrium is represented by a microcanonical ensemble of 
systems all having substantially the same energy, but mean rather than 
most probable values are taken as the averages to be considered. These 
means are calculated by an approximate method involving saddle-point 
† Fowler, Statistical Mechanics, second edition, Cambridge, 1936. 



integration. For systems of many degrees of freedom, all three of the 
methods mentioned lead to results which are in substantial agreement. 
The finding that somewhat altered treatments can lead to practically 
the same consequences has long been familiar in statistical mechanics.† 
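
The agreement between microcanonical and canonical treatments for many degrees of freedom can be illustrated with a toy model that is not taken from the text: a hypothetical set of identical oscillators sharing a fixed number of energy quanta. The sketch compares the exact microcanonical probability that one chosen oscillator holds n quanta with the canonical (geometric) probability having the same mean energy per oscillator. All names are assumptions made for the illustration.

```python
from math import comb


def microcanonical_probability(n, n_osc, quanta):
    """Exact chance that one oscillator holds n quanta when n_osc >= 2
    oscillators share exactly `quanta` quanta, all arrangements being
    counted as equally probable."""
    if n > quanta:
        return 0.0
    # Arrangements of the remaining quanta over the other oscillators.
    ways_rest = comb(quanta - n + n_osc - 2, n_osc - 2)
    # Arrangements of all the quanta over all the oscillators.
    ways_all = comb(quanta + n_osc - 1, n_osc - 1)
    return ways_rest / ways_all


def canonical_probability(n, n_osc, quanta):
    """Geometric distribution with the same mean, quanta / n_osc."""
    x = quanta / (quanta + n_osc)
    return (1.0 - x) * x ** n


def max_discrepancy(n_osc, quanta, n_max=5):
    """Largest difference between the two probabilities for n = 0..n_max."""
    return max(
        abs(microcanonical_probability(n, n_osc, quanta)
            - canonical_probability(n, n_osc, quanta))
        for n in range(n_max + 1)
    )


# Same mean energy per oscillator, increasing numbers of oscillators.
for n_osc in (3, 30, 300, 3000):
    print(n_osc, max_discrepancy(n_osc, 2 * n_osc))
```

With only three oscillators the two descriptions differ noticeably; with thousands they are in substantial agreement, in line with the remark above.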
In carrying out our development of the quantum form of statistical 
mechanics we shall make special efforts to distinguish clearly between 
those uncertainties in an observational situation, which are inherent in 
the quantum mechanics itself, and those additional uncertainties, due 
to lack of maximal knowledge, which are the primary concern of 
statistical mechanics. In the quantum mechanics, even if we had a 
maximal observation, so that the system under consideration could be 
definitely assigned to a precise state, say $k$ corresponding to an 
eigenvalue $F_k$ of some observable quantity $F$, this would only be sufficient 
to permit assertions as to the probability for finding the system in 
a state $m$ corresponding to an eigenvalue $G_m$ of some other kind of 
observable quantity $G$, except in the special case when the operators 
for the two quantities $F$ and $G$ commute. Hence, even the quantum 
mechanics itself has a statistical character that is foreign to the classical 
mechanics. In developing what we specifically designate as statisti- 
cal mechanics, however, we are interested both classically and quantum 
mechanically in cases where our observations have not been sufficient 
to determine the precise state of a system, so that we have to consider 
the statistical properties of some suitable representative ensemble. We 
shall employ an appropriate notation to assist in keeping this distinction 
clear. Thus the symbol $W_m$ would denote the probability that a given 
precisely specified quantum mechanical system will be found in a state 
$m$, and the symbol $P_m$ would denote the probability that a system 
picked at random from a given ensemble will be found in state $m$. 
Furthermore, we shall use a single bar to denote a mean which is 
obtained by averaging over different quantum mechanical possibilities, 
and a double bar to denote a mean which includes the additional 

† A word as to the relative advantages of the above three methods of treating systems 
in equilibrium may be of interest. The method of taking most probable values in a 
microcanonical ensemble is historically the earliest and often leads most expeditiously 
to the desired results. It runs into analytical difficulties when the most probable values 
are desired for integers which are too small to be regarded as corresponding to a 
continuous variable; for large numbers it gives averages which agree in ordinary cases with 
those obtained by the other two methods. The method of taking mean values in a 
canonical ensemble seems conceptually the most satisfactory and the computations are 
relatively simple and straightforward. The method of taking mean values in a 
microcanonical ensemble has been developed by Fowler into a very general and powerful 
scheme, but the computational apparatus is complicated. The means obtained agree 
precisely in ordinary cases with the means for a canonical ensemble. 



14 


INTBODUCTION 


process of aTeraging over the members of an ensemble of systems, 'riuis 
§ would denote the mean or expectation value for the obst'rvabh* (> 
in the case of a given precisely defined quantum system. an<l G would 
denote the mean value for all the systems in a given ensembh'. 
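The distinction between the single-bar and double-bar means can be made concrete with a small numerical sketch. Everything in it is an assumption introduced for illustration: the eigenvalues, the outcome probabilities of the two precise states, the ensemble weights, and the function names `quantum_mean` and `ensemble_mean`.

```python
# Hypothetical illustration: every number below is invented for the example.

def quantum_mean(eigenvalues, probabilities):
    """Single-bar mean: average of G over the quantum mechanical
    possibilities of ONE precisely specified system."""
    assert abs(sum(probabilities) - 1.0) < 1e-12
    return sum(g * w for g, w in zip(eigenvalues, probabilities))


def ensemble_mean(eigenvalues, members):
    """Double-bar mean: the single-bar mean averaged again over the
    members of an ensemble. `members` is a list of (weight, probabilities)
    pairs, one per precise state represented; weights sum to one."""
    assert abs(sum(weight for weight, _ in members) - 1.0) < 1e-12
    return sum(weight * quantum_mean(eigenvalues, probs)
               for weight, probs in members)


G_VALUES = [0.0, 1.0, 2.0]     # assumed eigenvalues of the observable G
STATE_A = [0.5, 0.5, 0.0]      # outcome probabilities in one precise state
STATE_B = [0.0, 0.25, 0.75]    # outcome probabilities in another precise state

mean_a = quantum_mean(G_VALUES, STATE_A)
mean_b = quantum_mean(G_VALUES, STATE_B)
# Ensemble in which 40% of the members are in state A and 60% in state B.
mean_ensemble = ensemble_mean(G_VALUES, [(0.4, STATE_A), (0.6, STATE_B)])
print(mean_a, mean_b, mean_ensemble)
```

The two single-bar means belong to definite states, while the double-bar mean lies between them, weighted by the composition of the ensemble.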

We shall base our development of the relations between statistical
mechanics and thermodynamics very definitely on the point of view
of Gibbs,† that a thermodynamic system, in a condition specified by
a limited number of macroscopic thermodynamic variables, can be
naturally regarded as a mechanical system in an incompletely specified
microscopic state, and hence correlated for purposes of treatment with
a representative ensemble of similar systems appropriately distributed
over a range of different precise states. Proceeding with this line of
attack, we shall not only be able to obtain mechanical correlates for the
variables, such as entropy and temperature, which thermodynamics
has specially introduced on its own level of abstraction, but shall also
be able to show that the laws of thermodynamics are satisfactorily
correlated with the statistical mechanical laws for the behaviour of the
appropriate representative ensembles. We shall not carry out this
programme until after we have obtained the principles of statistical
mechanics in their full quantum mechanical form. Nevertheless, we
shall find that the classical methods of Gibbs need but slight modification
to be specially suited for a quantum mechanical development of the
relations between statistics and thermodynamics. One can only feel that
the work of this great and modest scientist was guided by such
fundamental considerations that he based his development on those essential
features of the relations between mechanics and thermodynamics which
have not been altered by the change from classical to quantum mechanics.

In addition to the transcription from classical to quantum language,
there will be several less radical differences between the point of view
and methods adopted by Gibbs and those employed in this book. To
these we may devote a few words.

Impressed by the real failure of the classical statistical mechanics to
give a valid account of the phenomena of radiant heat and of the
specific heat of diatomic gases,‡ Gibbs was very cautious in making
assertions as to the relation of the systems actually encountered in
nature to the statistical ensembles which he discussed,‖ and seemed to
regard the properties of these ensembles as providing analogies rather
than explanations for the principles of thermodynamics. However, as
these apparent failures of statistical mechanics have since been resolved
by the quantum theory, we can now afford to adopt a more positive
attitude. This we do by clearly introducing and distinguishing between
the two concepts of system of interest and corresponding representative
ensemble, and by then asserting that an appropriately chosen ensemble
would really permit valid predictions as to the behaviour to be expected
on the average for a system of interest started off in successive trials
in the same partially specified condition under consideration. This
procedure then involves a more explicit consideration of the hypothesis as
to a priori probabilities and phases, used in constructing appropriate
ensembles, than was needed in the work of Gibbs.

† Gibbs, Elementary Principles in Statistical Mechanics, Yale University Press, 1902.

‡ Gibbs, loc. cit., p. 167.

‖ Gibbs, loc. cit., p. x of the preface.

A difference of emphasis, as compared with Gibbs, will lie in a more
complete discussion of the justification that can be given for the use of
the canonical distribution as representing a system which has arrived
in a condition of thermodynamic equilibrium (see §§111 and 112). We
shall regard this justification as sufficiently sound so that it will no
longer seem necessary to consider the so-called second and third
analogues for thermodynamic quantities which were based by Gibbs on
the use of the microcanonical instead of the canonical distribution as
representing thermodynamic equilibrium.

In the application of statistical mechanics to thermodynamic
problems, Gibbs† has employed the abstract concept of ‘perfectly
isolated’ systems, in which there would be absolutely no interaction
between a system and the walls of its container or its other immediate
surroundings, and has used this as the appropriate idealization for the
treatment of systems which would be regarded as isolated from the
point of view of thermodynamics. In this connexion, however, we shall
introduce (§111) a new concept of ‘essentially isolated’ systems, in
which interaction between a system and its surroundings without
change in mean energy would be allowed, and shall take this as the
appropriate idealization to use in the treatment of systems regarded
as isolated from the thermodynamic point of view. This will make it
possible (§124) to give a better treatment to thermodynamic,
reversible, adiabatic processes than was possible for Gibbs.

In conclusion it may be mentioned that in carrying out our treatment 
we shall try to be reasonably scrupulous in restating the significance of 
any symbol when it reappears in the text after a long interval. In 
addition, a list of symbols employed for different purposes will be found 
in Appendix I. 


† Gibbs, loc. cit., p. 164.



II 

THE ELEMENTS OF CLASSICAL MECHANICS 

6. Introduction 

It is the purpose of the present chapter to give a brief outline of the
principles of the classical mechanics, which will be sufficient for the
development of the classical statistical mechanics, and also sufficient
for an understanding of the later change to the point of view of the
quantum theory.

The fundamental ideas of the classical mechanics are, firstly, that
the state of a mechanical system may be observationally determined
at any time of interest; secondly, that the specification of this state
combined with a knowledge of the nature of the system will then be
sufficient to determine the values of all the mechanical properties of
the system at that time; and, finally, that the specification of the state
when further combined with the laws of mechanics will also be sufficient
for a complete determination and prediction of the future behaviour
of the system. For example, in the case of a system of interacting
particles, the state at any time could be specified by the positions and
velocities of the component particles; combined with a knowledge of
the masses of the particles and the forces they exert this would then
be sufficient to determine the values of mechanical quantities such as
the kinetic and potential energy of the system, its components of
momentum, etc.; and further combined with a knowledge of the laws
of mechanics, say Newton’s laws of motion, it would also be sufficient
to determine the complete future behaviour of the system.

This general idea that the observation of instantaneous state should
be sufficient for a complete determination of the condition and future
behaviour of a system has found increasing application ever since the
time of Newton. Combined originally with the laws of Newtonian
mechanics, it provided treatments for the dynamics of systems of
particles, of rigid bodies, and of continuous fluids and solids. Combined
at a later time with the more precise laws of relativistic mechanics, it
successfully provided a refined treatment for those same problems.
And combined with laws as to the behaviour of electricity and of heat,
it controlled the developments of electrodynamics and thermodynamics.
By the end of the nineteenth century it so dominated the whole of
scientific thought as to seem a necessary part of the axiomatic structure
of any kind of scientific procedure.





From the point of view of modern quantum theory, however, we now
know that this idea of the complete determination of a system by an
observation of its state was based on too superficial a view as to the
effect of observation on the system itself; and with the development
of a consistent and empirically verified quantum mechanics, we now
realize that such an idea is not a necessary axiom of science. From
our present viewpoint the classical mechanics then becomes a special
case of the quantum mechanics, valid at the so-called correspondence
principle limit where special quantum effects can be neglected.
Nevertheless, this special case includes, of course, all the phenomena that
could be successfully treated by classical methods, that is, all those
phenomena involving sufficiently massive objects so that the unknown
disturbing effects of observation are negligible. This makes the range
of validity so wide that the classical mechanics may still be developed
with profit as a separate discipline. To such a development we now
turn.


7. State of a system. Generalized coordinates and velocities 

In the classical mechanics we may regard the state of a system as
specified by the momentary positions and velocities of its parts. For
example, in the case of a system of particles, we could specify the state
of the system at any moment of interest by the Cartesian coordinates
x, y, z for the positions of the different particles, together with the
corresponding components of velocity $\dot{x}$, $\dot{y}$, $\dot{z}$, which give
the rates of change, $\dot{x} = dx/dt$, etc., taking place in these
coordinates with respect to time.

In specifying the state of a mechanical system we need not regard
ourselves as limited, however, to the use of Cartesian coordinates.
Rather, we shall find it advantageous to introduce from the very start
the idea of a set of generalized coordinates

$$q_1, q_2, \ldots, q_n \qquad (7.1)$$

which can be used for specifying the instantaneous configuration of
a system of interest, together with a corresponding set of generalized
velocities

$$\dot{q}_1, \dot{q}_2, \ldots, \dot{q}_n \qquad (7.2)$$

where the dots indicate rate of change with respect to time, which can
be used for specifying the instantaneous motion of the system. For
example, in the case of an oscillating pendulum we might find it
convenient to take as the single necessary generalized coordinate the angle
$\theta$ made by the pendulum with the vertical; in the case of a rigid body
we might use the Cartesian coordinates for its centre of gravity together
with the three angles necessary for describing the orientation of the
body; and in the case of two interacting particles it might be convenient
to choose the three Cartesian coordinates for the centre of gravity of
the pair together with their distance apart and the two angles giving the
orientation of the line joining them. The values of such coordinates
together with their rates of change with time would then be sufficient
for specifying the state of the system.
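The choice of coordinates mentioned for two interacting particles may be sketched as follows. The function names, the use of polar and azimuthal angles for the orientation of the joining line, and the numerical values in the usage note are assumptions made for illustration; the text fixes only which six quantities are used.

```python
import math


def to_generalized(m1, r1, m2, r2):
    """Cartesian positions r1, r2 of two distinct particles ->
    (centre of gravity, separation, polar angle, azimuthal angle)."""
    total = m1 + m2
    centre = tuple((m1 * a + m2 * b) / total for a, b in zip(r1, r2))
    dx, dy, dz = (b - a for a, b in zip(r1, r2))
    dist = math.sqrt(dx * dx + dy * dy + dz * dz)
    # Clamp guards against rounding pushing the ratio outside [-1, 1].
    theta = math.acos(max(-1.0, min(1.0, dz / dist)))
    phi = math.atan2(dy, dx)
    return centre, dist, theta, phi


def to_cartesian(m1, m2, centre, dist, theta, phi):
    """Inverse transformation back to the two Cartesian positions."""
    total = m1 + m2
    d = (dist * math.sin(theta) * math.cos(phi),
         dist * math.sin(theta) * math.sin(phi),
         dist * math.cos(theta))
    r1 = tuple(c - (m2 / total) * di for c, di in zip(centre, d))
    r2 = tuple(c + (m1 / total) * di for c, di in zip(centre, d))
    return r1, r2
```

For instance, `to_generalized(1.0, (0, 0, 0), 3.0, (1, 2, 2))` yields a separation of 3 and a centre of gravity three-quarters of the way along the joining line, and `to_cartesian` recovers the original positions. Six Cartesian coordinates are thus exchanged for six generalized ones, and the transformation does not involve the time.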

It will be possible to choose the generalized coordinates for a system
in a great variety of ways; and transformations between the languages
corresponding to different choices can be made with the help of the
equations connecting the values of one set of coordinates with those of
another. In the absence of specific statement to the contrary, it may
be assumed that so-called stationary coordinates are being used for
specifying the positions of the various parts of a system. These are
coordinates which can be derived by direct transformations, not
involving the time, from the coordinates corresponding to an unaccelerated
set of Cartesian axes. More general coordinates can be treated by the
methods of transformation to be discussed in § 14.

The number of coordinates introduced must, of course, always be
sufficient for a complete specification of the position and configuration
of the system. It will be advantageous, however, to choose coordinates
in such a way as to take the minimum number of independent variables
necessary for the description. In the absence of specific statement to
the contrary it is to be assumed that this has been done.

Having chosen the minimum number of coordinates needed to specify
the position and configuration, a system is called holonomic if each of
these coordinates could be independently varied without violating any
constraints inherent in the nature of the system. A system is called
non-holonomic, on the other hand, when the structure of the system is
such that the values of these coordinates are connected by relations
which prevent their independent variation. These relations restricting
the variations in the case of a non-holonomic system must be
non-integrable, since otherwise they could be used to reduce the number
of independent variables chosen, and we have already assumed that the
minimum possible number has been taken. In the case of either kind
of system, the number of actually independent variations that can be
made in the coordinates is spoken of as the number of degrees of freedom
of the system.

The systems ordinarily encountered in applications are holonomic in
character, and the method of treatment applicable to them can be
readily adapted to the non-holonomic case by introducing the
restrictions on independent variations at the appropriate point in the
development (see § 9). Hence we shall only give a treatment to holonomic
systems.

Two remarks may be made in concluding this section on the speci- 
fication of the state of a system, with the help of a set of generalized 
coordinates and corresponding generalized velocities. 

In the first place, it may be emphasized that the procedure tacitly 
assumes the possibility of making a simultaneous observational deter- 
mination of all the coordinates and velocities needed for the specifica- 
tion. When we come to the development of the quantum mechanics we 
shall find that this can be regarded as true only as a limiting case. 

In the second place, it may be noted that the formulation assumes
that a specification of coordinates and velocities alone is sufficient for
the complete determination of instantaneous state, and hence also
sufficient for the prediction of future behaviour when combined with
a knowledge of the nature of the system and of the laws of mechanics.
For example, it has been tacitly assumed that accelerations and higher
derivatives do not have to be included in the specification. The
development of classical methods has shown this to be actually realizable
although it may be necessary, as in treating the interactions of
electrically charged particles, to introduce the idea of a field with what
amounts to three coordinates and velocities at each point of space, thus
achieving the desired formulation only by making the number of degrees
of freedom of the system infinite.

8. Hamilton’s principle and the Lagrangian function 

Having discussed what is meant by the instantaneous state of a
system, we must now consider a formulation of the laws of mechanics
which will determine the changes in state which take place with time.
As a very general and satisfactory postulatory statement of these laws
we shall choose Hamilton’s principle, which can be expressed by the
variational equation

$$\delta \int_{t_1}^{t_2} L \, dt = 0, \qquad (8.1)$$

where the so-called Lagrangian function L is a quantity which depends
on the coordinates and velocities, and in some cases also explicitly on
the time in a manner to be discussed later.

This variational equation is to be understood in the following sense.
We consider the actual path by which the system passes from an initial
configuration at time $t_1$ to a later configuration at time $t_2$ and compare
the value of the integral $\int L \, dt$ over this actual path with the values
which it would have on neighbouring paths by which the system could
be carried in the same time interval from the same initial to the same later
configuration, without violating the constraints of the system.
Hamilton’s principle then states that the value of the integral $\int L \, dt$ for the
actual path has in any case a stationary value as compared with such
neighbouring paths.
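The stationary character of the integral can be checked numerically in a simple case. The following is a minimal sketch, assuming a single particle in a uniform gravitational field with $L = mv^2/2 - mgx$, thrown upward and caught at the same height; the parameter values, the trial displacement $\epsilon \sin(\pi t/T)$, and the trapezoid-style discretization are all choices made for illustration.

```python
import math

M, G_ACC, T_TOTAL, N = 1.0, 9.81, 1.0, 2000   # illustrative values only


def path(t, eps):
    """Actual path x = g t (T - t) / 2 plus a displacement that vanishes
    at both end times, so that the end configurations are unchanged."""
    return (0.5 * G_ACC * t * (T_TOTAL - t)
            + eps * math.sin(math.pi * t / T_TOTAL))


def action(eps):
    """Approximate the integral of L = m v^2 / 2 - m g x from 0 to T."""
    dt = T_TOTAL / N
    total = 0.0
    for j in range(N):
        x0, x1 = path(j * dt, eps), path((j + 1) * dt, eps)
        v = (x1 - x0) / dt
        total += (0.5 * M * v * v - M * G_ACC * 0.5 * (x0 + x1)) * dt
    return total


EPS = 1e-3
s0, s_plus, s_minus = action(0.0), action(EPS), action(-EPS)
first_order = (s_plus - s_minus) / (2 * EPS)          # slope at eps = 0
second_order = (s_plus - 2 * s0 + s_minus) / EPS**2   # curvature at eps = 0
print(first_order, second_order)
```

The first-order change vanishes to within rounding, while the second-order change does not; in this particular example the stationary value happens to be a minimum, although the principle itself asserts only stationarity.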

The above compact formulation of the laws of mechanics has, of
course, no actual content until we specify the form of the Lagrangian
function L. Indeed from a certain point of view we can regard the
empirical determination of the form of L, for the various kinds of systems
(mechanical or otherwise) amenable to Hamilton’s principle, as one
of the major tasks of the classical physics. It should be remarked that
Newtonian systems can be devised for which the equations of motion
cannot be derived from a variation principle; but for our purposes
Hamilton’s principle provides an adequately general basis.

In considering the form of the Lagrangian function we must first
make a distinction between conservative and non-conservative systems.
In the case of conservative systems the Lagrangian function depends
explicitly only on the coordinates and velocities which specify the state
of the system. Thus for a holonomic system of f degrees of freedom
we have

$$L = L(q_1, \ldots, q_f, \dot{q}_1, \ldots, \dot{q}_f). \qquad (8.2)$$

On the other hand, in the case of non-conservative systems the
Lagrangian function depends also explicitly on the time, so that we have

$$L = L(q_1, \ldots, q_f, \dot{q}_1, \ldots, \dot{q}_f, t). \qquad (8.3)$$

Resorting for the purposes of a quick description to language that
will already be familiar, conservative systems will prove to be those in
which the sum of the kinetic and potential energies remains constant
in time. They are thus isolated systems, unacted on by external forces,
and having no internal forces of dissipative (for example, frictional)
character. Non-conservative systems, on the other hand, will be those
in which the sum of the kinetic and potential energies is not constant
because of the action of either external or dissipative forces.

It is evident in this connexion that any system may be treated
as isolated, and hence unacted on by external forces, provided we
include a large enough region inside the boundary that we take for
the system. Furthermore, in view of the conservation of energy for
fundamental particles, it is evident that dissipative forces may be
eliminated from consideration if we introduce a sufficient number of
coordinates, corresponding to the fundamental particles of which the
system is constructed, so that thermal phenomena will themselves
receive a mechanical treatment, which is one of the goals of statistical
mechanics. Hence it becomes possible in principle to treat any
situation by the mechanical principles applicable to conservative systems.
Nevertheless, since it is desirable in the situations of interest for
statistical mechanics to preserve the distinction between system and external
surroundings which can act thereon, it is most useful to develop the
principles of mechanics in their general form where the Lagrangian
function L contains the time t explicitly and then consider the
simplifications that result for conservative systems.

Having made the general distinction between the two possibilities,
that the Lagrangian function L may or may not depend explicitly on
the time, we must now give consideration to the specific form of L,
since the actual content of Hamilton’s principle, as already remarked,
will be dependent on the form assigned to L as a function of the
coordinates $q_i$, velocities $\dot{q}_i$, and time t.

For simple mechanical systems, where the forces acting on the different
parts are derivable from a potential and the velocities are small
compared with that of light, the Lagrangian function L can be taken equal
to the difference between the kinetic energy of the system T and its
potential energy V,

$$L = T - V. \qquad (8.4)$$

The kinetic energy T in this expression is to be regarded as determined
from the masses m and velocities v of the different parts of the system
by a sum or integral over the kinetic energies $mv^2/2$ of these different
parts. And the potential energy V is to be regarded as the work
necessary to bring the system of interest from some standard internal
configuration and location with respect to external bodies to its actual
configuration and location at the instant being considered. In making
any application, these quantities are, of course, to be expressed as
functions of the particular set of generalized coordinates and
corresponding velocities that have been chosen for use. Assuming the use
of stationary coordinates, as defined in § 7, it will be seen that the
kinetic energy T will necessarily be a homogeneous quadratic function
of the generalized velocities with coefficients which can in general
depend on the coordinates $q_i$. It will also be seen that the potential
energy V will be a function of the coordinates alone in the case of
conservative systems, and of the coordinates and time t in the case
of non-conservative systems.

For more complicated mechanical systems, or even non-mechanical
ones, suitable choices of the Lagrangian may be possible which will
make Hamilton’s principle still applicable, sometimes in generalized
form. In the case of a system of particles having velocities v sufficiently
high compared with that of light c so that the special theory of relativity
must be applied, the quantity T appearing in (8.4) has only to be
replaced by $\sum m_0 c^2 \{1 - \sqrt{1 - v^2/c^2}\}$, where $m_0$ denotes the rest mass of
a particle and the summation is over all the particles of the system.
In the case of systems having additional forces beyond those derivable
from a potential, the expression for the Lagrangian function given by
(8.4) may sometimes be modified by making an appropriate addition
of a further function. (See, for example, §50 for the added term
corresponding to the magnetic force acting on a moving charged
particle.) In the case of combined mechanical and electromagnetic
systems, the Lagrangian function may be suitably set up by including
terms which correspond to the conditions at all points of the
electromagnetic field. In the case of gravitational effects so great that the
general theory of relativity has to be applied, a variational equation
can be employed which can be regarded as a generalization of
Hamilton’s principle. Thus, with a correct choice for the Lagrangian function
L, Hamilton’s principle can be regarded as having a wide range of
applicability. We shall ourselves be mainly interested in the simple
case of ordinary mechanical systems as given by (8.4).
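That the relativistic replacement for T goes over into the ordinary kinetic energy at low velocities can be verified numerically. The sketch below assumes the replacement to be $m_0 c^2 \{1 - \sqrt{1 - v^2/c^2}\}$ for each particle, which is the form that reduces to $m_0 v^2/2$ when $v \ll c$; the function names and the sample velocities are illustrative.

```python
import math

C = 299_792_458.0   # speed of light in m/s


def newtonian_t(m0, v):
    """Ordinary kinetic energy m0 v^2 / 2 entering L = T - V."""
    return 0.5 * m0 * v * v


def relativistic_t_term(m0, v):
    """Assumed relativistic replacement for T in the Lagrangian:
    m0 c^2 {1 - sqrt(1 - v^2/c^2)}."""
    return m0 * C * C * (1.0 - math.sqrt(1.0 - (v / C) ** 2))


for beta in (1e-3, 0.1, 0.9):
    v = beta * C
    ratio = relativistic_t_term(1.0, v) / newtonian_t(1.0, v)
    print(f"v/c = {beta}: ratio to m0 v^2 / 2 = {ratio:.6f}")
```

The ratio approaches unity as v/c becomes small and departs from it appreciably only when v is comparable with c. It may be noted that this term is the quantity entering the Lagrangian, and is not identical with the relativistic kinetic energy itself.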

In concluding this section on Hamilton’s principle, it may be
remarked that an important advantage of taking this particular principle
as the starting-point for mechanics lies in the consideration that the
form of expression, as given by (8.1), is independent of the coordinates
which we may desire to use. The principle is valid without reference
to the particular choice of generalized coordinates and velocities in
terms of which the Lagrangian function is expressed.

It may be further remarked that the feasibility of applying
Hamilton’s principle to such a wide variety of different kinds of systems is
certainly dependent to a considerable extent on the flexibility open to
us in picking a form for the Lagrangian function L which will actually
lead to a correspondence with empirical findings. Also connected with
this is the possibility already mentioned in § 7 of introducing the notion
of a field in order to make an infinite number of generalized coordinates
and velocities available for inclusion in the Lagrangian function.





9. The equations of motion in the Lagrangian form 
Having taken Hamilton’s principle as a convenient, compact formula- 
tion of the laws of mechanics, we must now obtain its consequences in 
forms suitable for computational applications. For this purpose we 
may first derive the equations of motion of a system in their so-called 
Lagrangian form. 

Considering a holonomic system of f degrees of freedom, we can write
Hamilton’s principle in the form

$$\delta \int_{t_1}^{t_2} L \, dt = 0, \qquad (9.1)$$

where L is a function of the f generalized coordinates $q_1, \ldots, q_f$ and the
corresponding velocities $\dot{q}_1, \ldots, \dot{q}_f$, together with an explicit
dependence on the time t in the case of non-conservative systems. Since the
variations in path are to be taken without changing the time of transition
$t_2 - t_1$ from the initial to the later configuration, the above may be
rewritten in the form

$$0 = \int_{t_1}^{t_2} \delta L(q_1 \ldots q_f, \dot{q}_1 \ldots \dot{q}_f, t) \, dt = \int_{t_1}^{t_2} \sum_i \left( \frac{\partial L}{\partial q_i} \delta q_i + \frac{\partial L}{\partial \dot{q}_i} \delta \dot{q}_i \right) dt, \qquad (9.2)$$

where $\delta q_i$ and $\delta \dot{q}_i$ denote the variations in the ith coordinate and
velocity on passing from the actual to a displaced path, and we take
a sum over all f degrees of freedom.

In order to simplify this expression, we note that we may put

$$\frac{d}{dt}(q_i + \delta q_i) = \dot{q}_i + \delta \dot{q}_i, \qquad (9.3)$$

since this merely asserts an equality between two different modes of
expressing a generalized velocity on the displaced path. This then
gives us

$$\delta \dot{q}_i = \frac{d}{dt} \delta q_i. \qquad (9.4)$$

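Returning to (9.2), and considering the terms in the integrand involving
$\delta \dot{q}_i$, we can then write

$$\int_{t_1}^{t_2} \frac{\partial L}{\partial \dot{q}_i} \delta \dot{q}_i \, dt = \int_{t_1}^{t_2} \frac{\partial L}{\partial \dot{q}_i} \frac{d}{dt} \delta q_i \, dt = \left[ \frac{\partial L}{\partial \dot{q}_i} \delta q_i \right]_{t_1}^{t_2} - \int_{t_1}^{t_2} \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}_i} \right) \delta q_i \, dt = - \int_{t_1}^{t_2} \frac{d}{dt} \left( \frac{\partial L}{\partial \dot{q}_i} \right) \delta q_i \, dt, \qquad (9.5)$$

where the term between limits has been taken as equal to zero, since
by hypothesis the system is to have the same configuration at times
$t_1$ and $t_2$ on the actual and displaced paths, thus making $\delta q_i$ equal to
zero at those times.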

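Substituting (9.5) into (9.2), and changing signs, we can then write

$$\int_{t_1}^{t_2} \sum_i \left( \frac{d}{dt} \frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} \right) \delta q_i \, dt = 0. \qquad (9.6)$$

For a holonomic system, however, the variations $\delta q_i$ would be
arbitrary.† Hence the above equation can hold in general only if the f
individual equations

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = 0 \quad (i = 1, \ldots, f), \qquad (9.7)$$

one for each degree of freedom, are themselves valid. These are the
desired equations of motion in the Lagrangian form for a holonomic
system of f degrees of freedom.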

† The development can be modified at this point for the treatment of non-holonomic
systems by introducing the equations, arising from the structure of the system, which
limit the independent variations of the coordinates.

To illustrate the significance of these equations we may consider the
simple case of a system of particles, moving with velocities small
compared with that of light, and acted on by forces, depending on their
positions relative to each other or relative to outside bodies, which
are derivable from a potential. For the Lagrangian function we can
then take

$$L = T - V, \qquad (9.8)$$

where T is the kinetic energy and V is the potential energy of the
system. Denoting the mass of the kth particle by $m_k$, and choosing as
our generalized coordinates the coordinates $x_k$, $y_k$, $z_k$ which would give
the positions of the various particles k with respect to an unaccelerated
set of Cartesian axes, the kinetic energy would be given by

$$T = \sum_k \tfrac{1}{2} m_k (\dot{x}_k^2 + \dot{y}_k^2 + \dot{z}_k^2), \qquad (9.9)$$

and the potential energy would be given by

$$V = V(x_1, y_1, z_1, \ldots, x_k, y_k, z_k, \ldots, t), \qquad (9.10)$$

where the explicit dependence on time t would be correlated with the
behaviour of the external bodies exerting forces on the particles.
Substituting these expressions for T and V into (9.8), and using the
resulting form of L in the Lagrangian equations (9.7), we are led to
results of the form

$$\frac{d}{dt}(m_k \dot{x}_k) + \frac{\partial V}{\partial x_k} = 0, \quad \frac{d}{dt}(m_k \dot{y}_k) + \frac{\partial V}{\partial y_k} = 0, \quad \frac{d}{dt}(m_k \dot{z}_k) + \frac{\partial V}{\partial z_k} = 0, \qquad (9.11)$$

for each of the particles k. Furthermore, from the definition of the
potential energy in terms of the work done by displacing the particles
against the forces acting on them, these equations could also be written
in the Newtonian form

$$F_{xk} = m_k \ddot{x}_k, \quad F_{yk} = m_k \ddot{y}_k, \quad F_{zk} = m_k \ddot{z}_k, \qquad (9.12)$$

where $F_{xk}$, $F_{yk}$, and $F_{zk}$ are the components of force acting on the kth
particle in the x-, y-, and z-directions.
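The agreement between the Lagrangian equations and the Newtonian form may be checked numerically for a single particle in one dimension. The sketch below assumes a potential $V = kx^2/2$ with illustrative values of m and k, and evaluates the left-hand side of the Lagrangian equation by central differences along a trial path; none of these particulars comes from the text.

```python
import math

M, K = 2.0, 8.0                 # illustrative mass and force constant
OMEGA = math.sqrt(K / M)


def lagrangian(x, v):
    return 0.5 * M * v * v - 0.5 * K * x * x      # L = T - V


def dL_dx(x, v, h=1e-5):
    return (lagrangian(x + h, v) - lagrangian(x - h, v)) / (2 * h)


def dL_dv(x, v, h=1e-5):
    return (lagrangian(x, v + h) - lagrangian(x, v - h)) / (2 * h)


def lagrange_residual(path, t, h=1e-4):
    """d/dt (dL/dv) - dL/dx along a trial path x(t), by central differences."""
    def velocity(s):
        return (path(s + h) - path(s - h)) / (2 * h)

    def momentum(s):
        return dL_dv(path(s), velocity(s))

    d_momentum_dt = (momentum(t + h) - momentum(t - h)) / (2 * h)
    return d_momentum_dt - dL_dx(path(t), velocity(t))


def true_path(t):
    return math.cos(OMEGA * t)            # satisfies m x'' = -k x


def wrong_path(t):
    return math.cos(1.5 * OMEGA * t)      # does not satisfy Newton's law


print(lagrange_residual(true_path, 0.3), lagrange_residual(wrong_path, 0.3))
```

The residual is zero, to within the accuracy of the differencing, along the path that obeys Newton's law with force $-\partial V/\partial x$, and is far from zero along the other path.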

We thus see, for this simple case, that the starting-point for the
development of mechanics provided by Hamilton’s principle, combined
with an appropriate choice for the Lagrangian function, leads to results in
agreement with the original starting-point of Newton. The above simple
example also indicates the possibility of supplementing the treatment
by introducing an appropriate ‘force function’ in order to secure
agreement with the Newtonian starting-point in more complicated cases
where the forces acting on the particles cannot be derived from a
potential. Indeed we may conclude the present section by emphasizing
that Newton’s laws of motion could themselves have been taken, in
the special case of systems of Newtonian ‘bodies’ or particles, as the
starting-point for a derivation of Hamilton’s principle. By starting
with Hamilton’s principle, however, we have chosen a postulate which
applies to systems in some respects more general than those of Newton,
and which is conveniently expressed in language independent of the
coordinate system.


10. Generalized momenta, the Hamiltonian function, and the 
canonical equations of motion 

The equations of motion in the Lagrangian form

$$\frac{d}{dt} \frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = 0 \quad (i = 1, \ldots, f) \qquad (10.1)$$

for a holonomic system of f degrees of freedom are seen to be a set of
f second-order equations. For many purposes it is desirable to replace
them by an equivalent set of 2f first-order equations.

To secure this we may begin by defining a new set of quantities,
$p_1, \ldots, p_f$, by the equations

$$p_i = \frac{\partial L}{\partial \dot{q}_i} \quad (i = 1, \ldots, f). \qquad (10.2)$$

These we call the generalized momenta corresponding to the generalized
coordinates $q_i$. Using Cartesian coordinates x, y, z for the case of
a single particle of mass m, it will be seen that the corresponding
momenta would be equal to the ordinary components of momentum
$m\dot{x}$, $m\dot{y}$, and $m\dot{z}$ for the particle.

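With the help of these momenta we may now also define another
new quantity H by the equation

$$H = p_1 \dot{q}_1 + \cdots + p_f \dot{q}_f - L, \qquad (10.3)$$

where $\dot{q}_1, \ldots, \dot{q}_f$ are the generalized velocities and L is the Lagrangian
function for the system. Differentiating this expression we obtain

$$dH = p_1 \, d\dot{q}_1 + \cdots + p_f \, d\dot{q}_f + \dot{q}_1 \, dp_1 + \cdots + \dot{q}_f \, dp_f - \frac{\partial L}{\partial q_1} dq_1 - \cdots - \frac{\partial L}{\partial q_f} dq_f - \frac{\partial L}{\partial \dot{q}_1} d\dot{q}_1 - \cdots - \frac{\partial L}{\partial \dot{q}_f} d\dot{q}_f - \frac{\partial L}{\partial t} dt$$

or, making use of (10.2),

$$dH = \dot{q}_1 \, dp_1 + \cdots + \dot{q}_f \, dp_f - \frac{\partial L}{\partial q_1} dq_1 - \cdots - \frac{\partial L}{\partial q_f} dq_f - \frac{\partial L}{\partial t} dt, \qquad (10.4)$$

since two sets of terms in the previous form of expression are seen to
cancel. This final form of expression shows that the quantity H could
itself be regarded as a function of the coordinates $q_1, \ldots, q_f$, the momenta
$p_1, \ldots, p_f$, and the time t, with the partial derivatives

$$\frac{\partial H}{\partial p_i} = \dot{q}_i, \quad \frac{\partial H}{\partial q_i} = -\frac{\partial L}{\partial q_i}, \quad \frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t}. \qquad (10.5)$$

When H actually is expressed in terms of the coordinates, momenta,
and time

$$H = H(q_1, \ldots, q_f, p_1, \ldots, p_f, t) \qquad (10.6)$$

it is called the Hamiltonian function for the system. It will be noted
from the form of the original equation of definition (10.3) that the
distinction between conservative and non-conservative systems will
be correlated respectively with the absence or presence of an
explicit dependence of the Hamiltonian on time.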
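The passage from the Lagrangian to the Hamiltonian may be illustrated for a simple pendulum with the angle $\theta$ as the single generalized coordinate. The sketch assumes $L = \tfrac{1}{2} m l^2 \dot{\theta}^2 + m g l \cos\theta$, with illustrative parameter values; it forms the momentum by a numerical derivative as in (10.2), eliminates the velocity in favour of the momentum, and evaluates H as in (10.3).

```python
import math

M, LENGTH, G_ACC = 1.5, 0.8, 9.81     # illustrative pendulum parameters


def lagrangian(theta, theta_dot):
    kinetic = 0.5 * M * LENGTH ** 2 * theta_dot ** 2
    potential = -M * G_ACC * LENGTH * math.cos(theta)
    return kinetic - potential


def momentum(theta, theta_dot, h=1e-6):
    """p = dL/d(theta_dot), equation (10.2), by a central difference."""
    return (lagrangian(theta, theta_dot + h)
            - lagrangian(theta, theta_dot - h)) / (2 * h)


def hamiltonian(theta, p):
    """H = p * theta_dot - L, equation (10.3), expressed through (theta, p)."""
    theta_dot = p / (M * LENGTH ** 2)     # inverts p = m l^2 theta_dot
    return p * theta_dot - lagrangian(theta, theta_dot)


theta, theta_dot = 0.4, 1.2
p = momentum(theta, theta_dot)
energy = (0.5 * M * LENGTH ** 2 * theta_dot ** 2
          - M * G_ACC * LENGTH * math.cos(theta))
print(p, hamiltonian(theta, p), energy)
```

For this conservative system with stationary coordinates the Hamiltonian comes out numerically equal to the sum of the kinetic and potential energies, and it contains no explicit dependence on the time.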

With the help of the foregoing we can now express the equations of 



§10 


THE CANONICAL EQUATIONS 


27 


motion in the so-called canonical form. Combining the definition of 
momenta given by (10.2) with the Lagrangian equations given by 
(10.1), we can evidently write 




dL 

dt 

■ dt\dqi, 

1 ~ Hi 


(10.7) 


for the rate of change in the ith component of momentum with time. 
Using this, the first two expressions in (10.5) can now be rewritten 
in the form 


$$\frac{dq_i}{dt} = \frac{\partial H}{\partial p_i} \qquad\text{and}\qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}. \tag{10.8}$$

These are the desired equations of motion in the Hamiltonian or canonical form. With a known form for the Hamiltonian $H$ as a function of the coordinates $q_i$, momenta $p_i$, and time $t$, their solution would evidently allow us to calculate the values of the coordinates in their dependence on time. Since they are first-order equations, with a remarkably symmetrical form, they are often more convenient for fundamental investigations than the corresponding Lagrangian equations. The corresponding coordinates and momenta $q_i$ and $p_i$ which they contain are spoken of as being conjugate variables.
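As a numerical illustration of the canonical equations (10.8), the following minimal sketch integrates them for a one-dimensional harmonic oscillator. The choice of system, the unit mass and force constant, the finite-difference derivatives, and the Runge-Kutta stepping are all assumptions made for this example and are not part of the text.

```python
import math

def hamiltonian(q, p, m=1.0, k=1.0):
    """H(q, p) for a one-dimensional harmonic oscillator (illustrative choice)."""
    return p * p / (2.0 * m) + 0.5 * k * q * q

def partial(fn, q, p, wrt, h=1e-6):
    """Central-difference estimate of dH/dq (wrt='q') or dH/dp (wrt='p')."""
    if wrt == "q":
        return (fn(q + h, p) - fn(q - h, p)) / (2.0 * h)
    return (fn(q, p + h) - fn(q, p - h)) / (2.0 * h)

def rhs(q, p):
    """Right-hand sides of (10.8): dq/dt = dH/dp, dp/dt = -dH/dq."""
    return partial(hamiltonian, q, p, "p"), -partial(hamiltonian, q, p, "q")

def rk4_step(q, p, dt):
    """One classical fourth-order Runge-Kutta step for the pair (q, p)."""
    k1q, k1p = rhs(q, p)
    k2q, k2p = rhs(q + 0.5 * dt * k1q, p + 0.5 * dt * k1p)
    k3q, k3p = rhs(q + 0.5 * dt * k2q, p + 0.5 * dt * k2p)
    k4q, k4p = rhs(q + dt * k3q, p + dt * k3p)
    return (q + dt * (k1q + 2 * k2q + 2 * k3q + k4q) / 6.0,
            p + dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0)

def integrate(q0, p0, t_end, n_steps):
    """Advance the canonical equations from t = 0 to t = t_end."""
    q, p = q0, p0
    dt = t_end / n_steps
    for _ in range(n_steps):
        q, p = rk4_step(q, p, dt)
    return q, p

# Starting from q = 1, p = 0 the known motion is q = cos t, p = -sin t.
q_end, p_end = integrate(1.0, 0.0, math.pi / 2.0, 1000)
```

Only the function `hamiltonian` is specific to the oscillator; any other $H(q, p)$ for one degree of freedom could be substituted without changing the stepping code.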


11. The change in quantities with time. Poisson brackets 


With the help of the equations of motion in the canonical form we can now obtain a useful expression for the rate of change with time in any quantity $F$, which can be expressed as a function of the coordinates $q_i$ and momenta $p_i$ of the system of interest. From the assumed dependence of $F$ on $q_i$ and $p_i$, and not explicitly on the time, we can evidently write

$$\frac{dF}{dt} = \sum_i \left(\frac{\partial F}{\partial q_i}\frac{dq_i}{dt} + \frac{\partial F}{\partial p_i}\frac{dp_i}{dt}\right), \tag{11.1}$$

where we take a summation over all degrees of freedom $i$. And substituting from the canonical equations (10.8), for the rates of change with time in the coordinates and momenta, this assumes the convenient form

$$\frac{dF}{dt} = \sum_i \left(\frac{\partial F}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial H}{\partial q_i}\frac{\partial F}{\partial p_i}\right). \tag{11.2}$$


Combinations of quantities such as that appearing on the right-hand side of (11.2) are quite important in the classical mechanics and have received the special name of Poisson brackets. For any two functions $M$ and $N$ of the coordinates and momenta of a system, the Poisson bracket $\{M, N\}$ is defined by

$$\{M, N\} = \sum_i \left(\frac{\partial M}{\partial q_i}\frac{\partial N}{\partial p_i} - \frac{\partial N}{\partial q_i}\frac{\partial M}{\partial p_i}\right), \tag{11.3}$$

where a summation is taken over all the degrees of freedom $i$ of the system.

Hence the expression for the rate of change with time of any function $F$ of the coordinates and momenta of a system can also be written in the abbreviated form

$$\frac{dF}{dt} = \{F, H\}. \tag{11.4}$$

This mode of writing will be of interest when comparison is later made with the analogous quantum mechanical expression for the rate of change with time in the expectation values of quantities. See §f$3 (a).
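The definition (11.3) and the abbreviated form (11.4) can be checked numerically. The sketch below is only an illustration: the helper names, the central-difference derivatives, and the sample Hamiltonian (a unit-mass, unit-force-constant oscillator) are assumptions introduced here, not taken from the text.

```python
def partial(fn, q, p, kind, i, h=1e-5):
    """Central-difference derivative of fn(q, p) with respect to q[i] or p[i]."""
    q_hi, q_lo, p_hi, p_lo = list(q), list(q), list(p), list(p)
    if kind == "q":
        q_hi[i] += h
        q_lo[i] -= h
    else:
        p_hi[i] += h
        p_lo[i] -= h
    return (fn(q_hi, p_hi) - fn(q_lo, p_lo)) / (2.0 * h)

def poisson_bracket(M, N, q, p):
    """{M, N} as defined in (11.3), summed over all degrees of freedom."""
    return sum(
        partial(M, q, p, "q", i) * partial(N, q, p, "p", i)
        - partial(N, q, p, "q", i) * partial(M, q, p, "p", i)
        for i in range(len(q))
    )

# Illustrative system with one degree of freedom.
H = lambda q, p: 0.5 * p[0] ** 2 + 0.5 * q[0] ** 2
F = lambda q, p: q[0] * p[0]
coordinate = lambda q, p: q[0]
momentum = lambda q, p: p[0]

q, p = [0.3], [0.4]
bracket_qp = poisson_bracket(coordinate, momentum, q, p)  # analytically 1
dF_dt = poisson_bracket(F, H, q, p)                       # analytically p^2 - q^2
dH_dt = poisson_bracket(H, H, q, p)                       # analytically 0
```

The last value anticipates the result of the next section: the bracket of the Hamiltonian with itself vanishes identically.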


12. The integral of energy and the interpretation of the Hamiltonian function

With the help of our general expression for the rate of change of quantities with time, we can now investigate the rate of change of the Hamiltonian function itself. We shall be interested in the case of conservative systems, where $H$ would not be an explicit function of the time $t$. Substituting $H$ in place of $F$ in the fundamental equation (11.2), for the rate of change in time of quantities which are solely functions of the coordinates and momenta, we then obtain

$$\frac{dH}{dt} = \sum_i \left(\frac{\partial H}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial H}{\partial q_i}\frac{\partial H}{\partial p_i}\right) = 0. \tag{12.1}$$

We now see that the Hamiltonian function for a conservative system would be a quantity which does not change with time. We have thus obtained one integral of the equations of motion for a conservative system

$$H(q_i, p_i) = E, \tag{12.2}$$

where $E$ is a constant.

The validity of this result depends on the circumstance that we have confined our considerations to conservative systems, and in case we employ stationary coordinates as defined in § 7 the constant $E$ turns out to be equal to the quantity ordinarily defined as the energy of the system. Hence the Hamiltonian function $H$ may now be interpreted as an expression for the energy of the system in terms of its coordinates and momenta.


In the case of simple mechanical systems the interpretation of the Hamiltonian as the energy of the system agrees with familiar usage. For such systems the Lagrangian function is given by

$$L = T - V, \tag{12.3}$$

where $T$ and $V$ are the so-called kinetic and potential energies of the system, as already defined in connexion with (8.4). Substituting this into the original expression (10.3) used for defining the Hamiltonian, we have

$$\begin{aligned}
H &= \sum_i p_i \dot q_i - L \\
  &= \sum_i \dot q_i \frac{\partial L}{\partial \dot q_i} - L \\
  &= \sum_i \dot q_i \frac{\partial T}{\partial \dot q_i} - T + V \\
  &= 2T - T + V \\
  &= T + V,
\end{aligned} \tag{12.4}$$

where the second form of writing comes from the definition of the momenta given by (10.2), the third form of writing comes from the fact that $T$ is the only part of the Lagrangian containing the velocities, and the next to the last form of writing comes from the consideration that our definition of $T$ was such as to make it a homogeneous quadratic function of the velocities in any system of stationary coordinates. The final result, combined with our original definitions of $T$ and $V$, shows that the Hamiltonian is equal to the usual expression for the energy of a simple mechanical system in terms of the spatial location, masses, and velocities of its parts.

In the case of more complicated conservative systems we can regard the constant quantity $H$ as defining what we shall call the energy of the system. Thus, for example, in the case of a system of particles moving with high enough velocities so that the special theory of relativity has to be applied, the Lagrangian function may be taken as

$$L = \sum m_0 c^2 \left\{1 - \sqrt{1 - u^2/c^2}\right\} - V, \tag{12.5}$$

and the Hamiltonian is then found to be equal to the relativistic expression for energy

$$H = \sum m_0 c^2 \left\{\frac{1}{\sqrt{1 - u^2/c^2}} - 1\right\} + V. \tag{12.6}$$

And treatments can be given in non-mechanical cases where Hamilton's principle applies.
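The passage from the Lagrangian (12.5) to the energy (12.6) can be verified numerically for a single free particle. The sketch below assumes one particle moving in one dimension with $V = 0$, and units in which the rest mass and the velocity of light are both unity; the function names are illustrative only.

```python
import math

C = 1.0    # velocity of light in the units assumed for this sketch
M0 = 1.0   # rest mass

def lagrangian(u):
    """Free-particle part of (12.5), with V = 0."""
    return M0 * C ** 2 * (1.0 - math.sqrt(1.0 - u ** 2 / C ** 2))

def momentum(u, h=1e-6):
    """p = dL/du by central difference, following the definition (10.2)."""
    return (lagrangian(u + h) - lagrangian(u - h)) / (2.0 * h)

def hamiltonian_from_legendre(u):
    """H = p*u - L, the one-particle form of the definition (10.3)."""
    return momentum(u) * u - lagrangian(u)

def relativistic_energy(u):
    """Kinetic part of (12.6), with V = 0."""
    return M0 * C ** 2 * (1.0 / math.sqrt(1.0 - u ** 2 / C ** 2) - 1.0)

u = 0.6
h_numeric = hamiltonian_from_legendre(u)
h_closed = relativistic_energy(u)
```

The two values agree to within the accuracy of the finite-difference derivative, which is what (12.6) asserts for this special case.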





13. The integrals of linear and of angular momentum 
Our general expression for the rate of change in quantities with time can also be used to investigate the components of linear and of angular momentum for a system as a whole.† We may confine our considerations to systems of ordinary particles acting under the influence of forces which can be derived from a potential. For such systems, using Cartesian coordinates $x_k$, $y_k$, $z_k$ for the different particles of masses $m_k$ that compose the system, it will readily be seen that the Hamiltonian can be expressed in the form

$$H = \sum_k \frac{1}{2m_k}\left(p_{xk}^2 + p_{yk}^2 + p_{zk}^2\right) + V(x_k, y_k, z_k), \tag{13.1}$$

where the generalized momenta $p_{xk}$, $p_{yk}$, etc., corresponding to the coordinates used, are the ordinary components of momentum for the individual particles

$$p_{xk} = m_k \dot x_k, \qquad p_{yk} = m_k \dot y_k, \qquad p_{zk} = m_k \dot z_k. \tag{13.2}$$

The linear momentum in the $x$-direction for such a system may then be defined by

$$P_x = \sum_k p_{xk}, \tag{13.3}$$

which takes a sum over the individual components of momentum in the $x$-direction for the different particles, and similar definitions may be given for the other directions. And the angular momentum around the $z$-axis for such a system may be defined by

$$M_z = \sum_k \left(x_k p_{yk} - y_k p_{xk}\right), \tag{13.4}$$

which takes a sum over the individual components of angular momentum for the different particles $k$, and similar definitions may be given for the other axes. We now wish to investigate the rate of change of these quantities in time, with the help of our general equation (11.2), for the rate of change of any function $F(q, p)$ of the coordinates and momenta

$$\frac{dF}{dt} = \sum_i \left(\frac{\partial F}{\partial q_i}\frac{\partial H}{\partial p_i} - \frac{\partial H}{\partial q_i}\frac{\partial F}{\partial p_i}\right), \tag{13.5}$$

where the summation is over all the degrees of freedom $i$.

For the rate of change in linear momentum with time, we immediately see from the foregoing that we shall obtain the general result

$$\frac{dP_x}{dt} = -\sum_k \frac{\partial V}{\partial x_k}, \tag{13.6}$$

where the summation is over all particles $k$. For an isolated system of particles, moreover, it is evident that the total potential energy $V$ would depend only on the relative positions of the particles and would not be altered if they were all displaced by the same amount in the $x$-direction. Hence, for the case of an isolated conservative system, the result reduces to

$$\frac{dP_x}{dt} = 0. \tag{13.7}$$

Making use of the similar results for the $y$- and $z$-directions, we are thus provided with three further integrals for the equations of motion,

$$P_x = \text{const.}, \qquad P_y = \text{const.}, \qquad P_z = \text{const.}, \tag{13.8}$$

for the case of an isolated system of particles.

† Whittaker, Analytical Dynamics, second edition, § 144, Cambridge, 1917.

Turning now to angular momentum, the previous equations (13.1), (13.4), and (13.5) are seen to lead to the result

$$\frac{dM_z}{dt} = \sum_k \frac{1}{m_k}\left(p_{xk}p_{yk} - p_{yk}p_{xk}\right) + \sum_k \left(y_k\frac{\partial V}{\partial x_k} - x_k\frac{\partial V}{\partial y_k}\right). \tag{13.9}$$

The first term in this expression immediately cancels out, and by changing to polar coordinates $x_k = r_k\cos\phi_k$ and $y_k = r_k\sin\phi_k$ around the $z$-axis, the second term is seen to reduce to

$$\frac{dM_z}{dt} = -\sum_k \frac{\partial V}{\partial \phi_k}. \tag{13.10}$$

For an isolated conservative system in which the potential energy would be unaltered by a mere rotation of the system about the $z$-axis, this then gives

$$\frac{dM_z}{dt} = 0. \tag{13.11}$$

Making use of the similar results for the $x$- and $y$-axes, we are thus provided with three still further integrals for the equations of motion,

$$M_x = \text{const.}, \qquad M_y = \text{const.}, \qquad M_z = \text{const.}, \tag{13.12}$$

for the case of an isolated system of particles.

In this and the preceding section we have thus obtained in all seven integrals of the equations of motion, and similar results hold not only for systems of particles but, in general, for isolated conservative systems.† These results are often spoken of as the conservation laws for the energy, for the three components of linear momentum, and for the three components of angular momentum of an isolated system as a whole. In the case of systems of a few degrees of freedom these integrals alone may be sufficient to provide a complete solution of the equations of motion.

† In the case of systems which have to be treated by the methods of general relativity, the analogue of the principle of the conservation of angular momentum has not been obtained, so far as the writer is aware.
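The integrals (13.8) and (13.12) can be observed in a small numerical experiment. The sketch below follows two particles moving in a plane under a potential that depends only on their separation; the potential $V = \tfrac12 k\,|r_1 - r_2|^2$, the masses, the initial values, and the kick-drift-kick stepping scheme are all assumptions chosen for the illustration.

```python
def spring_force(r1, r2, k=1.0):
    """Force on particle 1 from V = 0.5*k*|r1 - r2|^2; particle 2 feels the opposite."""
    return (-k * (r1[0] - r2[0]), -k * (r1[1] - r2[1]))

def linear_momentum_x(momenta):
    """P_x of (13.3)."""
    return sum(p[0] for p in momenta)

def angular_momentum_z(positions, momenta):
    """M_z of (13.4)."""
    return sum(r[0] * p[1] - r[1] * p[0] for r, p in zip(positions, momenta))

def kick(momenta, positions, half_dt):
    """Advance both momenta by half a step under equal and opposite forces."""
    f1 = spring_force(positions[0], positions[1])
    f2 = (-f1[0], -f1[1])
    return [(momenta[0][0] + half_dt * f1[0], momenta[0][1] + half_dt * f1[1]),
            (momenta[1][0] + half_dt * f2[0], momenta[1][1] + half_dt * f2[1])]

def step(positions, momenta, masses, dt):
    """One kick-drift-kick (velocity Verlet) step for the two-particle system."""
    momenta = kick(momenta, positions, 0.5 * dt)
    positions = [(r[0] + dt * p[0] / m, r[1] + dt * p[1] / m)
                 for r, p, m in zip(positions, momenta, masses)]
    momenta = kick(momenta, positions, 0.5 * dt)
    return positions, momenta

masses = [1.0, 2.0]
positions = [(1.0, 0.0), (-1.0, 0.5)]
momenta = [(0.0, 0.3), (0.2, -0.1)]
px_start = linear_momentum_x(momenta)
mz_start = angular_momentum_z(positions, momenta)
for _ in range(2000):
    positions, momenta = step(positions, momenta, masses, 0.01)
px_end = linear_momentum_x(momenta)
mz_end = angular_momentum_z(positions, momenta)
```

Because the assumed potential is unchanged by a common displacement or a common rotation of the two particles, both $P_x$ and $M_z$ keep their initial values apart from rounding error.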


14. Canonical transformations 

In order to complete this brief account of the classical mechanics we must now give some consideration to the possibility of making more general transformations of variables than those which we have hitherto had in mind. This will then put us in a position to consider general methods of undertaking the integration of the equations of motion, which are due to Hamilton and to Jacobi.

In the foregoing we have already considered the possibility of using different sets of generalized coordinates in treating the behaviour of a given mechanical system. We have regarded any such set of coordinates, however, as directly connected by transformation equations with the coordinates provided by some unaccelerated set of Cartesian axes, and indeed have found it convenient to consider only so-called stationary coordinates in which the new variables would be given in terms of the old variables by transformation equations which do not depend explicitly on the time.

We now wish to consider the much more general possibility of transforming from an original set of coordinates $q_i$ and momenta $p_i$ to a new set $\bar q_i$ and $\bar p_i$, where the equations of transformation

$$\begin{aligned}
\bar q_i &= \bar q_i(q_1 \dots q_f,\ p_1 \dots p_f,\ t), \\
\bar p_i &= \bar p_i(q_1 \dots q_f,\ p_1 \dots p_f,\ t)
\end{aligned} \tag{14.2}$$

are such that the new coordinates and momenta can be functions both of the old coordinates and of the old momenta, and also depend explicitly on the time if so desired. In carrying out such transformations, nevertheless, we shall wish to limit ourselves to changes of a character which would make the new variables, whatever the original form of $H$, still satisfy equations of motion

$$\frac{d\bar q_i}{dt} = \frac{\partial \bar H}{\partial \bar p_i}, \qquad \frac{d\bar p_i}{dt} = -\frac{\partial \bar H}{\partial \bar q_i} \tag{14.3}$$

of the same form as the canonical equations (10.8) applying to the original variables. Under such circumstances we shall then speak of the change of variables as being a canonical transformation.

Transformations of variables, having this desired property, can be carried out for a system of $f$ degrees of freedom with the help of an arbitrary function $F$ depending on $f$ of the old and $f$ of the new variables, together with the time $t$ if desired; and such a function may be called the generating function for the transformation. Such generating functions are ordinarily chosen in one of the four forms

$$F(q_i, \bar q_i, t), \quad F(q_i, \bar p_i, t), \quad F(p_i, \bar q_i, t), \quad\text{or}\quad F(p_i, \bar p_i, t),$$

and the transformation equations obtained will then have a form which depends on this choice. We may illustrate the procedure for obtaining such equations of transformation, making use of the first of these four forms, and then give a statement as to the results which can be obtained starting with the other forms.

Choosing the generating function in the form $F(q_i, \bar q_i, t)$, where the symbols $q_i$ and $\bar q_i$ are taken as representing the sets of old and new coordinates, we then assert that a canonical transformation of variables can be obtained by requiring the validity of the relation

$$\sum_i p_i \dot q_i - H = \sum_i \bar p_i \dot{\bar q}_i - \bar H + \frac{dF}{dt}, \tag{14.4}$$

where the summations $\sum$ are to be taken over all $f$ degrees of freedom $i$, and the last term is to be taken as signifying the total rate of change with time in the generating function $F$.

We may first show that a relation of the form (14.4) would be sufficient to determine the desired equations of transformation. Here we encounter two cases, according as the transformation is complete enough so that there are no identical relations connecting the old and new coordinates $q_i$ and $\bar q_i$, or is more limited so that the new coordinates are not completely independent of the old. In the first case, assuming the new coordinates independent of the old so that there will be no necessary connexion between the values of the $\dot q_i$'s and the $\dot{\bar q}_i$'s, and noting that in any case these generalized velocities are in each set independent among themselves, we then see that the general validity of the above equation can only be secured provided we satisfy the $2f+1$ individual relations

$$\begin{aligned}
p_i &= \frac{\partial F(q_i, \bar q_i, t)}{\partial q_i}, \\
\bar p_i &= -\frac{\partial F(q_i, \bar q_i, t)}{\partial \bar q_i}, \\
\bar H(\bar q_i, \bar p_i, t) &= H(q_i, p_i, t) + \frac{\partial F(q_i, \bar q_i, t)}{\partial t}.
\end{aligned} \tag{14.5}$$

In the second case, assuming the existence of identical relations between the old and new coordinates of the form†

$$\Omega_r(q_i, \bar q_i) = 0, \tag{14.6}$$

where the subscript $r$ denotes a particular one of these relations, it is evident that the validity of the equations (14.4) and (14.6) taken together is to be secured by the individual relations

$$\begin{aligned}
p_i &= \frac{\partial F}{\partial q_i} + \sum_r \lambda_r \frac{\partial \Omega_r(q_i, \bar q_i)}{\partial q_i}, \\
\bar p_i &= -\frac{\partial F}{\partial \bar q_i} - \sum_r \lambda_r \frac{\partial \Omega_r(q_i, \bar q_i)}{\partial \bar q_i}, \\
\bar H &= H + \frac{\partial F}{\partial t},
\end{aligned} \tag{14.7}$$

where the quantities $\lambda_r$ are undetermined constant multipliers and the summations $\sum$ are taken over all the identical relations $r$ which exist.

In the first case it will be seen that the equations given by the first two lines of (14.5) can be regarded as the desired equations of transformation, since they could be solved at will either for the $\bar q_i$ and $\bar p_i$ in terms of the $q_i$ and $p_i$ or vice versa. In the second case the first two lines of (14.7) together with (14.6) are enough to determine the $\bar q_i$, $\bar p_i$, and $\lambda_r$ in terms of the $q_i$ and $p_i$.

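We must now show that a relation of the form (14.4) is not only sufficient to provide transformation equations, but will also guarantee that the transformation actually is a canonical one. To do this let us consider the result of multiplying both sides of equation (14.4) by $dt$ and integrating over a given time-interval $t_1$ to $t_2$. We thus obtain

$$\int_{t_1}^{t_2} \Big(\sum_i p_i \dot q_i - H\Big)\,dt = \int_{t_1}^{t_2} \Big(\sum_i \bar p_i \dot{\bar q}_i - \bar H\Big)\,dt + \Big[F\Big]_{t_1}^{t_2}. \tag{14.8}$$

Let us further now consider the result of performing a variation on both sides of this equation corresponding to an altered path for the system but to the same time-interval $t_1$ to $t_2$. In carrying out this variation we shall replace increments in the velocities such as $\delta \dot q_i$ by $d(\delta q_i)/dt$ and then treat by the familiar method of partial integration; we shall not impose any restrictions, however, on the values of $\delta q_i$ and $\delta \bar q_i$ at the limits corresponding to $t_1$ and $t_2$. We then obtain, as will be readily verified,

$$\begin{aligned}
&\int_{t_1}^{t_2} \sum_i \left[\left(\frac{dq_i}{dt} - \frac{\partial H}{\partial p_i}\right)\delta p_i - \left(\frac{dp_i}{dt} + \frac{\partial H}{\partial q_i}\right)\delta q_i\right] dt + \Big[\sum_i p_i\,\delta q_i\Big]_{t_1}^{t_2} \\
&\quad = \int_{t_1}^{t_2} \sum_i \left[\left(\frac{d\bar q_i}{dt} - \frac{\partial \bar H}{\partial \bar p_i}\right)\delta \bar p_i - \left(\frac{d\bar p_i}{dt} + \frac{\partial \bar H}{\partial \bar q_i}\right)\delta \bar q_i\right] dt + \Big[\sum_i \bar p_i\,\delta \bar q_i\Big]_{t_1}^{t_2} + \Big[\delta F\Big]_{t_1}^{t_2}.
\end{aligned} \tag{14.9}$$

† The treatment can also be extended to the case where the $\Omega_r$ depend explicitly on the time.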


And since the first term on the left-hand side of this equation is equal to zero on account of the assumed validity of the canonical equations in the original variables, and the last term on the left-hand side taken together with the last two terms on the right-hand side will all cancel out, in accordance with (14.5) or in accordance with (14.6) and (14.7) in the two cases treated, we now obtain the result‡

$$\int_{t_1}^{t_2} \sum_i \left[\left(\frac{d\bar q_i}{dt} - \frac{\partial \bar H}{\partial \bar p_i}\right)\delta \bar p_i - \left(\frac{d\bar p_i}{dt} + \frac{\partial \bar H}{\partial \bar q_i}\right)\delta \bar q_i\right] dt = 0. \tag{14.10}$$

Finally, since the $\delta \bar q_i$ and $\delta \bar p_i$ would be arbitrary, we now see that this can only be satisfied if we also have the equations of motion in the canonical form,

$$\frac{d\bar q_i}{dt} = \frac{\partial \bar H}{\partial \bar p_i}, \qquad \frac{d\bar p_i}{dt} = -\frac{\partial \bar H}{\partial \bar q_i}, \tag{14.11}$$

holding in the new variables. Thus indeed we also find that the relation, expressed by (14.4), does determine a canonical transformation.

‡ The first term in (14.9) which we took as equal to zero on account of the validity of the canonical equations of motion can also be written in the form

$$\delta \int_{t_1}^{t_2} \Big(\sum_i p_i \dot q_i - H\Big)\,dt = 0,$$

provided we now take the variation $\delta$ as one in which the time of transition is not to be altered and the configuration at times $t_1$ and $t_2$ are now to be regarded as the same on the actual and varied paths. Since the integrand in this equation is equal by (10.3) to the Lagrangian $L$ for the system, although expressed in a different manner, the equation is often spoken of as the modified Hamilton's principle. Hence, comparing with (14.10), we can also regard a canonical transformation as defined by the property of leaving the modified Hamilton's principle invariant to the change in variables.

It may be remarked that the necessary and sufficient condition that a transformation be canonical may be put in the often convenient form

$$\{\bar q_i, \bar p_j\} = \delta_{ij}, \qquad \{\bar q_i, \bar q_j\} = \{\bar p_i, \bar p_j\} = 0, \tag{14.12}$$

where the $\bar q$ and $\bar p$ are regarded as functions of the $q$ and $p$ in evaluating the above Poisson brackets as defined by (11.3).
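For a system of one degree of freedom the condition (14.12) reduces to the single requirement $\{\bar q, \bar p\} = 1$, which is easily tested numerically. The sketch below compares two candidate transformations; both candidates, the sample point, and the finite-difference bracket are assumptions introduced for illustration.

```python
import math

def bracket(M, N, q, p, h=1e-5):
    """Poisson bracket (11.3) for one degree of freedom, by central differences."""
    dM_dq = (M(q + h, p) - M(q - h, p)) / (2.0 * h)
    dM_dp = (M(q, p + h) - M(q, p - h)) / (2.0 * h)
    dN_dq = (N(q + h, p) - N(q - h, p)) / (2.0 * h)
    dN_dp = (N(q, p + h) - N(q, p - h)) / (2.0 * h)
    return dM_dq * dN_dp - dN_dq * dM_dp

# Candidate 1: an angle and half the squared radius in the (q, p) plane.
q_bar = lambda q, p: math.atan2(q, p)
p_bar = lambda q, p: 0.5 * (q * q + p * p)

# Candidate 2: a plain rescaling of both variables by the factor 2.
q_scaled = lambda q, p: 2.0 * q
p_scaled = lambda q, p: 2.0 * p

q0, p0 = 0.3, 0.4
canonical_value = bracket(q_bar, p_bar, q0, p0)        # analytically 1
rescaled_value = bracket(q_scaled, p_scaled, q0, p0)   # analytically 4
```

The first candidate satisfies the condition and is canonical; the second gives the value 4 and hence is not, although rescaling the coordinate by 2 and the momentum by 1/2 would be.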

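As already mentioned, we could also obtain other forms for the equations of transformation corresponding to a canonical change of variables by choosing other forms for the generating function $F$. The methods of proof are entirely similar to those given above.† For convenience we give the following table, which lists, under each of the four principal forms for the generating function $F$, the corresponding forms which will be assumed by the equations of transformation, in the absence of identical relations between the variables on which $F$ depends.

$$\begin{array}{cccc}
F = F(q_i, \bar q_i, t) & F = F(q_i, \bar p_i, t) & F = F(p_i, \bar q_i, t) & F = F(p_i, \bar p_i, t) \\[6pt]
p_i = \dfrac{\partial F}{\partial q_i} & p_i = \dfrac{\partial F}{\partial q_i} & q_i = -\dfrac{\partial F}{\partial p_i} & q_i = -\dfrac{\partial F}{\partial p_i} \\[10pt]
\bar p_i = -\dfrac{\partial F}{\partial \bar q_i} & \bar q_i = \dfrac{\partial F}{\partial \bar p_i} & \bar p_i = -\dfrac{\partial F}{\partial \bar q_i} & \bar q_i = \dfrac{\partial F}{\partial \bar p_i} \\[10pt]
\bar H = H + \dfrac{\partial F}{\partial t} & \bar H = H + \dfrac{\partial F}{\partial t} & \bar H = H + \dfrac{\partial F}{\partial t} & \bar H = H + \dfrac{\partial F}{\partial t}
\end{array} \tag{14.13}$$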

In using these equations to secure canonical transformations, it is to be remembered that the generating function $F$ can be taken as an arbitrary function of its arguments, so that an infinite variety of canonical transformations can be obtained. It is convenient to have the equations of transformation in their different forms, since a given problem may be more easily treated with one form than another. For example, if we are interested in a transformation that involves identical relations between the old and new coordinates $q_i$ and $\bar q_i$, instead of using the first form of $F$ and allowing for the identical relations by using the more elaborate equations (14.7), we may be able to avoid complications by changing to the second form of $F$ provided that there are no identical relations between the $q_i$ and $\bar p_i$. The equations prove useful for a considerable variety of different problems.‡ We shall ourselves be specially interested in using the second form of the equations given by (14.13) for discussing two different general methods of integrating the equations of motion for a system. We now turn to a consideration of the first of these methods.

† For example, in the case of the second form of generating function $F(q_i, \bar p_i, t)$, a canonical transformation can be obtained from the equation

$$\sum_i p_i \dot q_i - H = -\sum_i \bar q_i \dot{\bar p}_i - \bar H + \frac{dF}{dt}.$$

By similar methods to those used above, it can readily be shown that this equation leads to the transformation equations given in the second column of (14.13), and that the transformation does have the desired canonical character.

‡ See, for example, Born, The Mechanics of the Atom, translation by Fisher and Hartree, London, 1927.

15. Integration of the equations of motion by transformation to 
cyclic coordinates. Hamilton’s characteristic function 
In treating the behaviour of mechanical systems with the help of the canonical equations of motion, we may encounter cases where the expression for the Hamiltonian of the system does not contain one or more of the coordinates which are being used. For example, the Hamiltonian may be of the form

$$H = H(q_2 \dots q_f,\ p_1 \dots p_f,\ t), \tag{15.1}$$

where the coordinate $q_1$ is missing. In accordance with the equations of motion in the canonical form, we should then have for the rate of change with time of the corresponding momentum $p_1$

$$\frac{dp_1}{dt} = -\frac{\partial H}{\partial q_1} = 0,$$

and hence at once obtain $p_1 = \text{const.}$ as one integral of the equations of motion.

The angular orientation around the axis of rotation of a freely 
rotating body provides a simple example of such a coordinate, the 
Hamiltonian expression for the energy of the body being independent 
of the value of this angle, and the corresponding (angular) momentum 
being constant. Coordinates which do not enter into the expression for 
the Hamiltonian are hence often called cyclic coordinates. 

The above specially simple character of cyclic coordinates suggests a method of treating the equations of motion for a system by making a canonical transformation to new variables such that none of the new coordinates will appear in the altered expression for the Hamiltonian and hence will all have the so-called cyclic character. In the case of conservative systems this is then found to provide, at least in principle, a complete solution of the equations of motion.

Considering a conservative system, and making use of the second form of generating function given in (14.13), we can write equations of transformation in the form

$$p_i = \frac{\partial F}{\partial q_i}, \qquad \bar q_i = \frac{\partial F}{\partial \bar p_i}, \qquad \bar H(\bar q_i, \bar p_i) = H(q_i, p_i). \tag{15.2}$$

Substituting the first of these equations into the third, and requiring the altered Hamiltonian to be independent of the new coordinates, we can then write

$$H\left(q_1 \dots q_f,\ \frac{\partial F}{\partial q_1} \dots \frac{\partial F}{\partial q_f}\right) = \bar H(\bar p_1 \dots \bar p_f) \tag{15.3}$$

as an equation whose solution would provide a generating function $F(q_i, \bar p_i)$ which would give a canonical transformation to new variables such that none of the new coordinates $\bar q_i$ would appear in the altered Hamiltonian.

Since equations of motion in the canonical form would still apply in the transformed language, we can now write

$$\frac{d\bar p_1}{dt} = -\frac{\partial \bar H}{\partial \bar q_1} = 0, \quad \dots, \quad \frac{d\bar p_f}{dt} = -\frac{\partial \bar H}{\partial \bar q_f} = 0 \tag{15.4}$$

for the rate of change of the new momenta with time. And thus write

$$\bar p_1 = \alpha_1, \quad \dots, \quad \bar p_f = \alpha_f \tag{15.5}$$

for the values of the new momenta where the quantities $\alpha_1, \dots, \alpha_f$ are constants. This then makes the altered Hamiltonian a function of the constants $\alpha_i$,

$$\bar H = \bar H(\alpha_1 \dots \alpha_f). \tag{15.6}$$

Hence, again making use of the canonical equations of motion, we can write

$$\frac{d\bar q_1}{dt} = \frac{\partial \bar H}{\partial \alpha_1} = \omega_1, \quad \dots, \quad \frac{d\bar q_f}{dt} = \frac{\partial \bar H}{\partial \alpha_f} = \omega_f \tag{15.7}$$

for the rate of change of the new coordinates with time, where $\omega_1, \dots, \omega_f$ would themselves be constants. Or, on integrating, we have

$$\bar q_1 = \omega_1 t + \beta_1, \quad \dots, \quad \bar q_f = \omega_f t + \beta_f, \tag{15.8}$$

where $\beta_1, \dots, \beta_f$ are constants of integration. We are thus provided with a solution of the equations of motion for a conservative system by a transformation to new coordinates which depend linearly on the time.

The above treatment has made no specification as to the form of the generating function $F$ except that it shall be such as to make the transformed Hamiltonian $\bar H$ depend in some way solely on the new momenta. A specially simple case, which makes the transformed Hamiltonian itself equal to one of the new momenta, arises when we choose Hamilton's so-called characteristic function $S$ for the system as the generating function. This function $S$ is defined as the general solution of the so-called Hamilton-Jacobi partial differential equation

$$H\left(q_1 \dots q_f,\ \frac{\partial S}{\partial q_1} \dots \frac{\partial S}{\partial q_f}\right) = E, \tag{15.9}$$

and will be of the form

$$S = S(q_1 \dots q_f,\ E,\ \alpha_2 \dots \alpha_f) + \text{const.}, \tag{15.10}$$

where the $f$ constants of integration are given by $E, \alpha_2, \dots, \alpha_f$ together with the arbitrary additive constant. Using this expression as giving the generating function $F$ in equations (15.2) with $E, \alpha_2, \dots, \alpha_f$ as the new momenta, we then have the transformed Hamiltonian,

$$\bar H = E, \tag{15.11}$$

directly equal to one of the new momenta which is itself the energy $E$ of the system. And since the canonical equations of motion still hold in the new variables, our previous solutions (15.5) and (15.8) now reduce to the form

$$\bar p_1 = E, \quad \bar p_2 = \alpha_2, \quad \dots, \quad \bar p_f = \alpha_f \tag{15.12}$$

and

$$\bar q_1 = t + \beta_1, \quad \bar q_2 = \beta_2, \quad \dots, \quad \bar q_f = \beta_f, \tag{15.13}$$

since $\partial \bar H/\partial E$ becomes unity and the $\partial \bar H/\partial \alpha_i$ become zero.

From the present point of view the problem of integrating the equations of motion for a conservative system, then, reduces to the problem of obtaining a general solution for the Hamilton-Jacobi partial differential equation (15.9). In principle we are thus provided with a method of completely solving for the motion of any conservative system. In practice, it is to be noted, however, that the transformed coordinates and momenta $\bar q_1 \dots \bar q_f$, $E$, $\alpha_2 \dots \alpha_f$ may have a very complicated relation to any actual variables which we should wish to use in describing the behaviour of the system. It is also to be noted that the solution of the Hamilton-Jacobi equation may be impracticable unless it is possible to apply the so-called method of separation of variables.
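The content of (15.13), namely that the new coordinate $\partial S/\partial E$ increases uniformly with the time, can be illustrated for the harmonic oscillator, where the Hamilton-Jacobi equation (15.9) gives $\partial S/\partial q = \sqrt{2m(E - \tfrac12 k q^2)}$. The sketch below is a numerical check under assumed unit mass and force constant; the quadrature rule, step sizes, and function names are choices made for this example.

```python
import math

M, K = 1.0, 1.0   # unit mass and force constant (assumed for the example)

def characteristic_function(q, energy, n=2000):
    """S(q, E) = integral from 0 to q of sqrt(2M(E - K x^2 / 2)) dx, midpoint rule."""
    width = q / n
    total = 0.0
    for j in range(n):
        x = (j + 0.5) * width
        total += math.sqrt(2.0 * M * (energy - 0.5 * K * x * x))
    return total * width

def new_coordinate(q, energy, d_energy=1e-4):
    """The new coordinate dS/dE of (15.2), by central difference in E."""
    return (characteristic_function(q, energy + d_energy)
            - characteristic_function(q, energy - d_energy)) / (2.0 * d_energy)

energy = 0.5
q_start, q_end = 0.0, 0.5

# By (15.13) the change in dS/dE between two positions is the elapsed time.
elapsed_from_s = new_coordinate(q_end, energy) - new_coordinate(q_start, energy)

# Known motion of this oscillator: q = A sin(omega t) with A = sqrt(2E/K).
omega = math.sqrt(K / M)
amplitude = math.sqrt(2.0 * energy / K)
elapsed_exact = (math.asin(q_end / amplitude) - math.asin(q_start / amplitude)) / omega
```

The time obtained from the characteristic function agrees with that of the familiar sinusoidal solution, without the equations of motion having been integrated directly.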

16. Integration of the equations of motion by transformation 
to constant coordinates and momenta. Hamilton’s principal 
function 

In concluding the chapter we may also consider another method of 
integrating the equations of motion by transforming to new variables, 
using Hamilton’s so-called principal function as the generating function 
for the transformation. The method is applicable both to conservative 
and to non-conservative systems, and leads at least formally to new 
coordinates and momenta all of which are constants independent of
the time. 

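Taking $H(q_1 \dots q_f,\ p_1 \dots p_f,\ t)$ as an expression for the Hamiltonian of the system of interest in terms of the original coordinates and momenta and the time, Hamilton's principal function $W$ for the system may be defined as the general solution of the partial differential equation

$$H\left(q_1 \dots q_f,\ \frac{\partial W}{\partial q_1} \dots \frac{\partial W}{\partial q_f},\ t\right) + \frac{\partial W}{\partial t} = 0. \tag{16.1}$$

The desired solution will then be of the form

$$W = W(q_1 \dots q_f,\ \alpha_1 \dots \alpha_f,\ t) + \text{const.}, \tag{16.2}$$

with $f+1$ arbitrary constants.

For the special case of a conservative system, the equation defining the principal function will reduce to

$$H\left(q_1 \dots q_f,\ \frac{\partial W}{\partial q_1} \dots \frac{\partial W}{\partial q_f}\right) + \frac{\partial W}{\partial t} = 0, \tag{16.3}$$

and it is then evident that we can take the solution to be of the form

$$W = S - Et, \tag{16.4}$$

where $S$ again denotes Hamilton's characteristic function for the system, as defined by the Hamilton-Jacobi equation

$$H\left(q_1 \dots q_f,\ \frac{\partial S}{\partial q_1} \dots \frac{\partial S}{\partial q_f}\right) = E. \tag{16.5}$$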


Let us now regard Hamilton's principal function as the generating function for a canonical transformation of the second type, listed in (14.13), with the new momenta equal to the constants $\alpha_1, \dots, \alpha_f$. In accordance with (14.13) we then have the original momenta given by

$$p_i = \frac{\partial W}{\partial q_i}, \tag{16.6}$$

the new coordinates given by

$$\bar q_i = \frac{\partial W}{\partial \alpha_i}, \tag{16.7}$$

and the new Hamiltonian given by

$$\bar H = H + \frac{\partial W}{\partial t} = 0. \tag{16.8}$$

Hence, since the canonical equations of motion would still hold in the new variables, and the new Hamiltonian has the value zero independent of the new coordinates and momenta, the equations of motion in the new variables would assume the specially simple form

$$\bar q_i = \beta_i \qquad\text{and}\qquad \bar p_i = \alpha_i, \tag{16.9}$$

where the $\alpha$'s and $\beta$'s are arbitrary constants. Hence, in principle at least, we are thus provided with $2f$ integrals of the equations of motion.
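A free particle in one dimension gives perhaps the simplest check of these statements. The sketch below assumes the principal function $W = \sqrt{2m\alpha}\,q - \alpha t$, with the single constant $\alpha$ equal to the energy, and verifies numerically that it satisfies (16.1) and that $\partial W/\partial \alpha$ remains constant along the actual motion. The unit mass, the sample values, and the helper names are assumptions made for the example.

```python
import math

MASS = 1.0  # assumed unit mass

def principal_function(q, alpha, t):
    """W = sqrt(2 m alpha) q - alpha t for a free particle, alpha being the energy."""
    return math.sqrt(2.0 * MASS * alpha) * q - alpha * t

def derivative(fn, x, h=1e-6):
    """Central-difference derivative of a function of one variable."""
    return (fn(x + h) - fn(x - h)) / (2.0 * h)

def hamilton_jacobi_residual(q, alpha, t):
    """Left-hand side of (16.1) with H = p^2 / 2m and p = dW/dq."""
    dW_dq = derivative(lambda x: principal_function(x, alpha, t), q)
    dW_dt = derivative(lambda x: principal_function(q, alpha, x), t)
    return dW_dq ** 2 / (2.0 * MASS) + dW_dt

def new_coordinate(q, alpha, t):
    """The new coordinate dW/d(alpha) of (16.7)."""
    return derivative(lambda a: principal_function(q, a, t), alpha)

alpha = 2.0
speed = math.sqrt(2.0 * alpha / MASS)   # particle velocity for energy alpha
trajectory = [(0.5 + speed * t, t) for t in (0.0, 1.0, 2.5)]
residuals = [hamilton_jacobi_residual(q, alpha, t) for q, t in trajectory]
betas = [new_coordinate(q, alpha, t) for q, t in trajectory]
```

With these values the new coordinate keeps the constant value $\beta = 0.25$ at every point of the trajectory, in agreement with (16.9).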
It is of interest in connexion with the foregoing to consider the behaviour of a moving surface which can be defined, in the space corresponding to the coordinates $q_1, \dots, q_f$, by setting the principal function $W$ equal to a constant, in accordance with

$$W(q_1 \dots q_f,\ \alpha_1 \dots \alpha_f,\ t) = W_0, \tag{16.10}$$

where $\alpha_1, \dots, \alpha_f$ and $W_0$ are constants which determine a class of motions of the system. The surface defined by (16.10) would evidently assume successively changing positions in the $q_1, \dots, q_f$ space as time proceeds.

For example, in the case of conservative systems, where we can use $W = S - Et$ as given by (16.4) to express the principal function $W$ in terms of the characteristic function $S$, the position of the surface could be taken at any time $t$ as agreeing with the surface defined by

$$S(q_1 \dots q_f,\ E,\ \alpha_2 \dots \alpha_f) = W_0 + Et, \tag{16.11}$$

where the energy $E$ together with $\alpha_2, \dots, \alpha_f$ and $W_0$ are $f+1$ constants determining the class of motions under consideration.

Returning to the general equation for our moving surface (16.10), let us now pick any point $q_1, \dots, q_f$ lying on the surface at time $t$, and consider the vector that could be erected at this point with the components

$$\frac{\partial W}{\partial q_1},\ \frac{\partial W}{\partial q_2},\ \dots,\ \frac{\partial W}{\partial q_f}. \tag{16.12}$$

This vector has two important properties. In the first place, it will be seen, in accordance with (16.6), that its components are equal to the momenta $p_1, \dots, p_f$ which systems of the class under consideration would have when their coordinates $q_1, \dots, q_f$ are those for the point taken. In the second place, it will be seen from the values for its components given by (16.12) that the vector would be normal to the instantaneous surface under consideration. Hence the surface defined by (16.10) has the useful property of assuming successive configurations and positions such that a normal erected at any point on the surface would have components proportional to the momenta of a system arriving at that point with a motion of the class specified by the $f+1$ constants $\alpha_1, \dots, \alpha_f$ and $W_0$.

This result becomes specially simple and interesting if we take a single particle of mass $m$ as the system to be treated, and use its Cartesian coordinates $x$, $y$, $z$ as determining the space in which the surface defined by $W = \text{const.}$ moves. The normal at any point on this surface then has components proportional to the ordinary components of momenta

$$\frac{\partial W}{\partial x} = m\dot x, \qquad \frac{\partial W}{\partial y} = m\dot y, \qquad \frac{\partial W}{\partial z} = m\dot z \tag{16.13}$$

for the particle and hence lies in the same direction as the velocity with which a particle of the class considered would pass through the point taken. Hence the moving surface and the system of normals which it determines might now be considered as having the nature of a moving wave-front which determines a system of rays along which a class of particle motions can take place. It should be noted in this connexion that the velocity of the wave-front and that of the associated particle motion along the rays would not in general be the same.
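The closing remark, that the wave-front and the particle need not travel at the same rate, can be made concrete for a free particle in one dimension. The sketch below assumes the principal function $W = px - Et$ with $E = p^2/2m$; the function name and the numerical values are illustrative only.

```python
import math

def front_and_particle_speed(mass, energy):
    """Speeds for a free particle described by W = p*x - E*t."""
    p = math.sqrt(2.0 * mass * energy)   # p = dW/dx, as in (16.13)
    particle_speed = p / mass            # ordinary velocity of the particle
    # A surface W = const. obeys p*dx - E*dt = 0, so it advances at the rate E/p.
    front_speed = energy / p
    return front_speed, particle_speed

front, particle = front_and_particle_speed(mass=1.0, energy=2.0)
```

In this special case the surface of constant $W$ advances with just half the velocity of the particle, since $E/p = p/2m$.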

In concluding this section on the properties of Hamilton's principal function for a mechanical system, it may be well to emphasize the distinction between the two different but related roles which we can regard this function as playing in the explanation of mechanical phenomena. On the one hand, we can think of the principal function as making possible the integration of the equations of motion, when it is used as a function for transforming to new coordinates and momenta, which are all of them constants. This was the point of view of Jacobi. On the other hand, we can think of the principal function as making it possible to obtain a description of mechanical motions in terms similar to those of wave-front and associated rays as used in geometrical optics. This was the point of view of special interest to Hamilton in the original development of the ideas we have been discussing. It may also be remarked that this latter point of view was specially suggestive to Schroedinger in his original method of developing a wave mechanics for treating quantum phenomena, his idea being that a complete wave mechanics should be related to ordinary mechanics in the same way that a complete wave theory of optics is related to geometrical optics.



III

STATISTICAL ENSEMBLES IN THE CLASSICAL MECHANICS 
17. Ensemble and phase space 

In the preceding chapter we have developed the principles of the 
classical mechanics, and are thus provided with methods for treating 
the behaviour of any given mechanical system of interest as it changes 
with time from one precisely defined state to another. We are now 
ready to begin our consideration of the statistical methods that can be 
employed when we need to treat the behaviour of a system concerning 
whose condition we have some knowledge but not enough for a com- 
plete specification of precise state. For this purpose we shall wish to 
consider the average behaviour of a collection of systems of the same
structure as the one of actual interest but distributed over a range of
different possible states. Using the original terminology of Gibbs we
may speak of such a collection as an ensemble of systems.†

In order to investigate the behaviour of such ensembles of systems 
it is convenient to have a quasi-geometrical language which can be used 
in specifying the state of each system in the ensemble, and in describing 
the condition of the ensemble as a whole. For this purpose, corresponding
to any system of $f$ degrees of freedom, we can construct a
conceptual Euclidean space of $2f$ dimensions, with $2f$ rectangular axes,
one for each of the coordinates $q_1, \dots, q_f$ and one for each of the momenta
$p_1, \dots, p_f$ whose values would determine the state of the system.
Again using the terminology of Gibbs, who designated the changes 
in state of a system as being changes in phase, we can speak of such a 
conceptual space as a phase space for the kind of system under con- 
sideration. 

The instantaneous state of any system in the ensemble can then be 
regarded as specified by the position of a representative point in the 
phase space, and the condition of the ensemble as a whole can be 

† The terminology in this book agrees in general with the original terminology of
Gibbs (loc. cit.). The word system is used, as in mechanics and as in thermodynamics,
to denote the physical object of interest. Such a system may, of course, be composed
of a large number of elements such as particles, atoms, molecules, electrons, modes of
vibration, or what not, but this is not involved in the fundamentals of the theory.
A collection of such systems which is studied from a statistical point of view is called
an ensemble of systems.

The terminology of Fowler (loc. cit.) differs in that the word system is applied to the
molecules or other elements composing the object of interest, and the whole is called
an assembly of systems. This terminology would be clumsy for our purposes, however,
since we should then have to speak of ensembles of assemblies of systems.




regarded as described by a ‘cloud’ of such representative points, one
for each system in the ensemble. The behaviour of the ensemble as
time proceeds can then be associated with the ‘streaming’ motion of
this cloud of points as they describe trajectories in the phase space
in accordance with the laws of mechanics. The representative points
for the different systems are often spoken of as phase points. Since the
position of a phase point completely determines the state of the corresponding
system, it will be noted that the motion of any phase point
is an unambiguous function of its instantaneous position.
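
A minimal numerical sketch of this picture, assuming for illustration an ensemble of one-dimensional harmonic oscillators ($f = 1$): the constants `M` and `K` and the names `energy`, `phase_velocity` and `evolve` are hypothetical choices made here, not part of the text. Each system is one point $(q, p)$, and the velocity of that point depends only on where it is.

```python
import math

M, K = 1.0, 1.0  # assumed mass and spring constant (illustrative values only)

def energy(q, p):
    """Hamiltonian H = p^2/(2M) + K q^2/2 of one oscillator."""
    return p * p / (2.0 * M) + 0.5 * K * q * q

def phase_velocity(q, p):
    """Velocity (dq/dt, dp/dt) of a phase point; it depends on position only."""
    return p / M, -K * q

def evolve(q, p, t):
    """Exact motion of a phase point after a time t."""
    w = math.sqrt(K / M)
    c, s = math.cos(w * t), math.sin(w * t)
    return q * c + p / (M * w) * s, p * c - M * w * q * s

# The ensemble: one representative point (q, p) per system.
ensemble = [(0.1 * i, 1.0 - 0.05 * i) for i in range(5)]
# The 'streaming' of the cloud: every point follows its own trajectory.
streamed = [evolve(q, p, 0.7) for q, p in ensemble]
```

The systems never interact in this sketch; each point is advanced independently, which is the sense in which the members of an ensemble are independent.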

A number of remarks as to the nature of such ensembles and phase
spaces may now be made to assure complete understanding.

It is to be emphasized that the various systems which compose an
ensemble are not to be regarded as interacting with each other, but
each as carrying out its own independent behaviour in accordance with
the laws of mechanics. To be sure, at a later stage of development,
Chapters XIII and XIV, when we study the nature of heat transfer
to or from a system, we shall wish to consider ensembles which can be
regarded as constructed by placing members of one ensemble representing
a given system of interest in thermal contact with the members
of another ensemble representing a heat reservoir; but in the absence of
specific statement to the contrary, each system in an ensemble is to be
regarded as carrying out its own independent motion. In particular it
should be emphasized that a system composed of a large number of
similar interacting elements or subsystems, such as a gas composed of a
large number of similar molecules colliding with each other, is not to be
confused with an ensemble of independent systems. Indeed, in using
statistical mechanics to study the behaviour of such a gas, it will be
necessary to consider an ensemble of independent systems each of which
will itself be a separate sample of the whole gas of interest.

In the case of systems composed of a large number of individual
elements, such as molecules, it is evident that the coordinates and
momenta, which could be assigned separately to the individual molecules,
can, if desired, be taken as furnishing the total collection of
coordinates and momenta to be used for the system as a whole. Furthermore,
since we could construct a phase space for each individual molecule
with axes corresponding to its own coordinates and momenta, it
is evident that we could then also regard the collection of phase spaces
for the individual molecules as providing the total phase space for the
system as a whole.

In this connexion a terminology which is sometimes convenient has 




been introduced by Ehrenfest.† The phase space for the system as
a whole (gas) is called a $\gamma$-space, and the phase space for any individual
kind of element (molecule) contained in the system is called a $\mu$-space
for that species of element. Using an appropriate $\gamma$-space, the state of
any system can then be completely specified by the position therein
of a single representative point. Or using a $\mu$-space for each kind of
molecule involved, the condition of the system may also be described
by giving the numbers of representative points for the component
molecules which lie in the different regions of the $\mu$-space for each
species of molecule. The latter description of condition is, of course,
less complete than the former, since it makes no distinction between
different molecules of the same kind. By interchanging such-like molecules
different precise states could be obtained, all of which would
agree with the condition described by merely specifying the numbers
of molecules whose representative points fall in different regions of the
appropriate $\mu$-space.
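
The difference between the two descriptions can be sketched in a few lines of Python. The sketch assumes a toy gas of three molecules with one coordinate each; the cell size and the names `mu_space_counts`, `gas` and `swapped` are illustrative assumptions. The ordered list of molecular points plays the part of the single $\gamma$-space point, and the occupation numbers give the coarser $\mu$-space description.

```python
from collections import Counter

def mu_space_counts(molecules, cell=1.0):
    """Occupation numbers: how many molecules lie in each (q, p) cell of mu-space."""
    return Counter((int(q // cell), int(p // cell)) for q, p in molecules)

# gamma-space description: the ordered (q, p) values of molecules 1, 2, 3.
gas = [(0.2, 1.5), (2.7, -0.4), (0.9, 1.1)]
# Interchange molecules 1 and 3: a different precise state of the gas ...
swapped = [gas[2], gas[1], gas[0]]
# ... which nevertheless has the same occupation numbers in mu-space.
same_condition = mu_space_counts(gas) == mu_space_counts(swapped)
```

Here `gas` and `swapped` differ as precise states while `same_condition` is true, which is the sense in which the $\mu$-space description is the less complete of the two.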

Returning to the phase space for a system as a whole, it is also some- 
times convenient to regard it as constructed out of the configuration 
space that corresponds to the set of coordinates chosen for the system
taken together with the momentum space that corresponds to the set 
of momenta conjugate to those coordinates. These two spaces regarded 
in combination then give the complete phase space which is con- 
veniently used in specifying the state of the system. 

It would, of course, also be possible to consider a conceptual space 
constructed out of a configuration space for the system taken together 
with the corresponding velocity space. This procedure would also furnish
a combined space such that the state of the system could be completely
specified by the position of a representative point. Nevertheless, when
we come to the derivation of Liouville’s theorem, we shall find that
the phase space, with rectangular axes corresponding to each coordinate
and momentum, has such specially simple properties that it is the 
convenient one to choose. 

18. Density of distribution in the phase space. Averages for the ensemble

In the case of an ensemble of systems it is evident, in accordance
with the preceding section, that the state of each individual system in 
the ensemble could be precisely specified at any instant by giving the 

† P. and T. Ehrenfest, ‘Mechanik der aus sehr zahlreichen diskreten Teilen
bestehenden Systeme’, Encykl. d. math. Wiss. iv. 2, ii, Heft 6, Leipzig.




position of the corresponding representative point in a suitable phase
space. In using ensembles for statistical purposes, however, it is to be
noted that there is no need to maintain distinctions between individual
systems, since we shall be interested merely in the numbers of systems
at any time which would be found in the different states that correspond
to different regions in the phase space. Moreover, it is also to
be noted for statistical purposes that we shall wish to use ensembles
containing a large enough population of separate members so that the
numbers of systems in such different states can be regarded as changing
continuously as we pass from the states lying in one region of the phase
space to those in another. Hence, for the purposes in view, it is evident
that the condition of an ensemble at any time can be regarded as
appropriately specified by the density $\rho$ with which representative
points are distributed over the phase space.

This density of distribution

$$\rho = \rho(q_1 \dots q_f, p_1 \dots p_f, t) \tag{18.1}$$

is to be taken, in the case of systems of $f$ degrees of freedom, as a function
of the $2f$ coordinates and momenta $q_1 \dots q_f, p_1 \dots p_f$ which correspond
to the different coordinate axes in the phase space. It is also to be
taken as a function of the time $t$, since the density would, in general,
change with time at any point in phase space owing to the motion of
the phase points as the coordinates and momenta for the corresponding
systems change their values in the manner prescribed by the principles
of mechanics. As a convenient abbreviation we may use

$$\rho = \rho(q, p, t) \tag{18.2}$$

as an expression to indicate this dependence on coordinates, momenta,
and time.

The quantity $\rho$ is then to be understood as determining the number
of systems $\delta N$, which would be found at time $t$ to have coordinates and
momenta lying in any selected infinitesimal range $\delta q_1 \dots \delta q_f, \delta p_1 \dots \delta p_f$, in
accordance with the equation

$$\delta N = \rho(q, p, t)\, \delta q_1 \dots \delta q_f\, \delta p_1 \dots \delta p_f. \tag{18.3}$$

We assume a large enough total population of systems so that $\rho$ and
$\delta N$ can be regarded with sufficient approximation as changing continuously
as we go from one region in the phase space to another.

By integrating over the whole of phase space, we can write

$$N = \int \dots \int \rho(q, p, t)\, dq_1 \dots dp_f \tag{18.4}$$

as an expression for the total number of systems $N$ in the ensemble or





phase points in the phase space. For the physical interpretation of this
formalism it is essential that the integral (18.4) for $N$ converge; we
shall in general be concerned only with densities $\rho$ which have this
property. Equation (18.4) then gives

$$\frac{\rho}{N} = \frac{\rho(q, p, t)}{\int \dots \int \rho(q, p, t)\, dq_1 \dots dp_f} \tag{18.5}$$

as the probability per unit extension in the phase space that the phase
point for a system chosen at random from the ensemble would be found
at time $t$ to have the specified values of the $q$’s and $p$’s. With the help
of this expression and a knowledge of the actual dependence of $\rho$ on
the $q$’s and $p$’s, this then makes it possible to calculate any desired
kind of average, over all the systems in the ensemble, of any quantity
which depends on the coordinates and momenta of the systems. For
example, if we consider a mechanical quantity $F(q, p)$, which characterizes
systems of the kind under consideration, its mean value for all
the systems in the ensemble would be given at time $t$ by

$$\bar{\bar{F}} = \frac{1}{N} \int \dots \int F(q, p)\, \rho(q, p, t)\, dq_1 \dots dp_f = \frac{\int \dots \int F(q, p)\, \rho(q, p, t)\, dq_1 \dots dp_f}{\int \dots \int \rho(q, p, t)\, dq_1 \dots dp_f}. \tag{18.6}$$


We use the double bar, as in $\bar{\bar{F}}$, to indicate a mean over all the systems of
the ensemble, since we shall wish to reserve the single bar, as in $\bar{F}$, to denote
the mean or expectation value for a quantum mechanical quantity in
the case of a single system.
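
As a rough numerical illustration of (18.4) and (18.6), the sketch below assumes a system of one degree of freedom, a Gaussian cloud of 1000 systems for $\rho$, and the oscillator energy for $F$. These choices, and the names `rho`, `F` and `integrate`, are assumptions made for the example only.

```python
import math

def rho(q, p):
    """Assumed density in phase: a Gaussian cloud containing 1000 systems."""
    return 1000.0 * math.exp(-(q * q + p * p) / 2.0) / (2.0 * math.pi)

def F(q, p):
    """Quantity to be averaged; here an oscillator energy with unit mass and stiffness."""
    return 0.5 * (p * p + q * q)

def integrate(g, lim=8.0, n=200):
    """Midpoint-rule double integral of g(q, p) over the square [-lim, lim]^2."""
    h = 2.0 * lim / n
    pts = [-lim + (i + 0.5) * h for i in range(n)]
    return sum(g(q, p) for q in pts for p in pts) * h * h

N = integrate(rho)                                           # total number, as in (18.4)
mean_F = integrate(lambda q, p: F(q, p) * rho(q, p)) / N     # ensemble mean, as in (18.6)
```

With this density the integral for `N` converges to the assumed population, and dividing by `N` makes the mean independent of how $\rho$ happens to be normalized.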

Instead of taking the quantity $\rho$ precisely as above, it is sometimes
convenient to regard it as normalized to unity, in accordance with the
equation

$$1 = \int \dots \int \rho(q, p, t)\, dq_1 \dots dp_f. \tag{18.7}$$

The quantity $\rho$ itself then gives directly the probability per unit volume
of finding the phase point for a system picked at random from the
ensemble in different regions of the phase space, and the expression for
the mean value of any function $F(q, p)$ of the coordinates and momenta
reduces to the simpler form

$$\bar{\bar{F}} = \int \dots \int F(q, p)\, \rho(q, p, t)\, dq_1 \dots dp_f. \tag{18.8}$$

In the original development of Gibbs these two senses in which $\rho$ may
be used were distinguished by the separate designations, density in
phase $D$ and coefficient of probability of phase $P$. Nowadays, however,




it is usual to regard the two possibilities as merely corresponding to
two different possible modes of normalizing $\rho$.

In order to avoid confusion we shall adopt the practice of always
using the density $\rho$ in the obvious sense first given above, unless a
statement is made to the contrary. It seems desirable to point out the
two possibilities, however: in the first place since there is no universally
accepted convention in the matter, and in the second place since the
corresponding quantum mechanical quantity, the density matrix $\rho_{nm}$
to be introduced later, is most frequently taken as normalized to unity.

19. Liouville’s theorem for the change in density with time 

Having seen in the foregoing section that the condition of an ensemble
can be appropriately described at any instant by the density
of distribution $\rho$ of representative points in the phase space, we may
now inquire into the changes, which take place in this quantity with
time, as the representative points describe trajectories in the phase
space in accordance with the principles of mechanics. We may first
consider the rate of change of $\rho$ with time at any given point in the
$qp$-phase space on which we fix our attention.

To treat this, let us consider, at any point $q_1 \dots q_f, p_1 \dots p_f$ in the
phase space, the differential element of extension that would be defined
at that point by $\delta q_1 \dots \delta q_f\, \delta p_1 \dots \delta p_f$. For the number of phase points
inside this element at any instant, we can write

$$\delta N = \rho\, \delta q_1 \dots \delta q_f\, \delta p_1 \dots \delta p_f, \tag{19.1}$$

where $\rho$ is the density at the point and instant under consideration.
This number will, in general, be changing with the time, however, since
the number of phase points entering the element of hypervolume
through any ‘face’ will, in general, be different from the number which
are leaving through the opposite ‘face’. Thus, if we consider the two
faces perpendicular to the $q_1$-axis, which are located at $q_1$ and $q_1 + \delta q_1$,
we could write for the number of phase points entering the first of
these surfaces per unit time the expression

$$\rho\, \dot q_1\, \delta q_2 \dots \delta q_f\, \delta p_1 \dots \delta p_f, \tag{19.2}$$

where $\rho$ and $\dot q_1$ are the density and indicated component of velocity
for representative points at $q_1 \dots q_f, p_1 \dots p_f$, and should then have as the
corresponding expression for the number of phase points leaving the
opposite face

$$\left( \rho + \frac{\partial \rho}{\partial q_1} \delta q_1 \right) \left( \dot q_1 + \frac{\partial \dot q_1}{\partial q_1} \delta q_1 \right) \delta q_2 \dots \delta q_f\, \delta p_1 \dots \delta p_f, \tag{19.3}$$



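where in both expressions higher order differentials have been neglected.
Combining these two expressions, again with neglect of higher order
differentials, and summing up over all such resulting terms for the
$f$ coordinates and for the $f$ momenta, we then obtain

$$\frac{d(\delta N)}{dt} = -\sum_{i=1}^{f} \left\{ \rho \left( \frac{\partial \dot q_i}{\partial q_i} + \frac{\partial \dot p_i}{\partial p_i} \right) + \left( \frac{\partial \rho}{\partial q_i} \dot q_i + \frac{\partial \rho}{\partial p_i} \dot p_i \right) \right\} \delta q_1 \dots \delta q_f\, \delta p_1 \dots \delta p_f \tag{19.4}$$

as an expression for the rate of change with time in the number of
representative points $\delta N$ lying in the specified element of phase space.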

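This result, however, can be immediately simplified. In accordance
with the equations of motion in the canonical form (10.8), we can write

$$\dot q_i = \frac{\partial H}{\partial p_i} \quad \text{and} \quad \dot p_i = -\frac{\partial H}{\partial q_i} \tag{19.5}$$

as expressions for the rate of change with time in the coordinates and
momenta of a system of the kind under consideration. The quantity
$H$ appearing in these expressions is the Hamiltonian for the system as
a function of the coordinates and momenta, and hence, since the order
of differentiation is immaterial, we obtain

$$\frac{\partial \dot q_i}{\partial q_i} = -\frac{\partial \dot p_i}{\partial p_i} \quad \text{or} \quad \sum_i \left( \frac{\partial \dot q_i}{\partial q_i} + \frac{\partial \dot p_i}{\partial p_i} \right) = 0, \tag{19.6}$$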

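which leads to a cancellation of the first terms on the right-hand side
of (19.4). Furthermore, dividing (19.4) through by the element of
extension $\delta q_1 \dots \delta q_f\, \delta p_1 \dots \delta p_f$, we evidently obtain the rate of change in
the density itself at the point of interest, so that we can now write our
result in the desired simple form

$$\left( \frac{\partial \rho}{\partial t} \right)_{q,p} = -\sum_{i=1}^{f} \left( \frac{\partial \rho}{\partial q_i} \dot q_i + \frac{\partial \rho}{\partial p_i} \dot p_i \right). \tag{19.7}$$

We use the symbol of partial differentiation with respect to time to
indicate that we are fixing our attention on a given stationary point
in the phase space.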
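
The equation of incompressibility (19.6) can be checked numerically for a particular case. The sketch below assumes a pendulum with $H = p^2/2 - \cos q$ and, for contrast, the same pendulum with an added friction term whose equations are no longer of canonical form; both systems and all function names are illustrative assumptions.

```python
import math

def hamiltonian_field(q, p):
    """(q_dot, p_dot) from H = p^2/2 - cos q by the canonical equations (19.5)."""
    return p, -math.sin(q)

def damped_field(q, p, gamma=0.2):
    """The same pendulum with an assumed friction term; not of canonical form."""
    return p, -math.sin(q) - gamma * p

def phase_divergence(field, q, p, h=1e-6):
    """d(q_dot)/dq + d(p_dot)/dp at (q, p), by central differences."""
    dqdot_dq = (field(q + h, p)[0] - field(q - h, p)[0]) / (2 * h)
    dpdot_dp = (field(q, p + h)[1] - field(q, p - h)[1]) / (2 * h)
    return dqdot_dq + dpdot_dp

div_canonical = phase_divergence(hamiltonian_field, 0.7, 0.3)
div_damped = phase_divergence(damped_field, 0.7, 0.3)
```

For the canonical equations the divergence vanishes, so the first terms of (19.4) drop out; for the damped motion it equals minus the friction coefficient, and the simple form (19.7) would not follow.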

The result given by (19.7) proves to be of fundamental importance 
for statistical mechanics. It is often spoken of as Liouville’s theorem, 
since the so-called equation of incompressibility (19.6) on which it is 
based is of a form considered by that investigator.† The result can be
expressed in a variety of different forms which prove convenient under 
different circumstances. 

† Liouville, Journ. de Math. 3, 349 (1838).



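Substituting the expressions for $\dot q_i$ and $\dot p_i$ given by (19.5) into (19.7),
we can write Liouville’s theorem in the form

$$\left( \frac{\partial \rho}{\partial t} \right)_{q,p} = -\sum_i \left( \frac{\partial \rho}{\partial q_i} \frac{\partial H}{\partial p_i} - \frac{\partial \rho}{\partial p_i} \frac{\partial H}{\partial q_i} \right). \tag{19.8}$$

Or, noting the definition given by (11.3) for a Poisson bracket, we can
also write this in the form

$$\left( \frac{\partial \rho}{\partial t} \right)_{q,p} = -[\rho, H]. \tag{19.9}$$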

The rate of change with time, at a given point of phase space, in the
density of distribution $\rho(q, p, t)$ for an ensemble of systems is thus given
by an expression of the same form, but with opposite sign, as that
for the rate of change with time in a function of the coordinates and
momenta $F(q, p)$ for a single system. This will be of interest when we
later consider the analogous quantum mechanical expressions. See
§ 81(c).

Returning once more to (19.7), and transposing the summation of
terms to the other side of the equation, we can also write Liouville’s
theorem in the form

$$\left( \frac{\partial \rho}{\partial t} \right)_{q,p} + \sum_i \left( \frac{\partial \rho}{\partial q_i} \dot q_i + \frac{\partial \rho}{\partial p_i} \dot p_i \right) = 0. \tag{19.10}$$

Hence, taking cognizance of the full dependence of the density $\rho(q, p, t)$
on the coordinates, momenta, and time, and noting that $\dot q_i$ and $\dot p_i$ are
themselves expressions for the components of the velocity with which
a representative point would move through the phase space, we now
obtain the simple result

$$\frac{d\rho}{dt} = 0, \tag{19.11}$$


when we consider the rate of change of density in the neighbourhood
of any selected moving phase point instead of in the neighbourhood of
a fixed point in the phase space. This form of expression may be called,
in accordance with Gibbs, the principle of the conservation of density
in phase.
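
A numerical check of (19.7), and hence of the conservation of density in phase, can be sketched for an assumed special case: an oscillator with unit mass and stiffness, whose phase points rotate rigidly about the origin, and an initial Gaussian cloud centred away from the origin. The density at a later time is obtained by carrying the initial values along the trajectories; the function names are illustrative only.

```python
import math

def back(q, p, t):
    """Where the point now at (q, p) was a time t earlier, for H = (p^2 + q^2)/2."""
    c, s = math.cos(t), math.sin(t)
    return q * c - p * s, p * c + q * s

def rho0(q, p):
    """Assumed initial density: a Gaussian cloud centred at (1, 0)."""
    return math.exp(-((q - 1.0) ** 2 + p ** 2))

def rho(q, p, t):
    """Density carried along by the motion: the initial value at the earlier position."""
    return rho0(*back(q, p, t))

def liouville_residual(q, p, t, h=1e-5):
    """Left side minus right side of (19.7), estimated by central differences."""
    d_t = (rho(q, p, t + h) - rho(q, p, t - h)) / (2 * h)
    d_q = (rho(q + h, p, t) - rho(q - h, p, t)) / (2 * h)
    d_p = (rho(q, p + h, t) - rho(q, p - h, t)) / (2 * h)
    q_dot, p_dot = p, -q          # canonical equations for this Hamiltonian
    return d_t + d_q * q_dot + d_p * p_dot

residual = liouville_residual(0.3, -0.7, 0.4)
```

At the fixed point chosen the density is changing with time, yet the residual vanishes to within the error of the differences, since the change is entirely accounted for by points streaming past.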

Finally, with the help of (19.11), we may obtain one further form of
expression for Liouville’s theorem which will prove useful. For this
purpose let us consider a region in the phase space

$$\delta v = \int \dots \int dq_1 \dots dp_f, \tag{19.12}$$

which is taken small enough so that the density $\rho$ can be regarded as



uniform over its extension. For the number of representative points
inside this region we can then write

$$\delta N = \rho\, \delta v = \rho \int \dots \int dq_1 \dots dp_f. \tag{19.13}$$

Let us now follow the motion of this little region through the phase
space, allowing its boundaries to be permanently determined by the
representative points originally lying thereon. Then, since no representative
points can be created or destroyed on account of their correlation
with mechanical systems, and since no representative points
can cross the boundaries on account of the unambiguous determination
of mechanical motions, we must have the result

$$\frac{d(\delta N)}{dt} = \frac{d\rho}{dt}\, \delta v + \rho\, \frac{d(\delta v)}{dt} = 0. \tag{19.14}$$

In accordance with (19.11), however, the term containing $d\rho/dt$ is equal
to zero, since we are following a natural motion in the phase space.
Hence, we then obtain the desired result

$$\frac{d}{dt} \int \dots \int dq_1 \dots dp_f = 0. \tag{19.15}$$

It will be noted from the possibility of combining small elements that 
this final form of expression would evidently also apply to an exten- 
sion of any magnitude in the phase space provided its boundaries 
are permanently determined by the same selection of representative 
points. 

Again adopting the terminology of Gibbs, this last form of Liouville’s 
theorem may be called the principle of the conservation of extension in 
phase. In accordance with this theorem any extension in the phase 
space, with moving boundaries specified as above, would retain a con- 
stant ‘volume’ as time proceeds. Its ‘shape’, however, would, in 
general, change with time, and after a sufficient interval this shape 
might become so ‘filamentous’ as to extend into many different parts 
of the phase space. 
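
The constancy of ‘volume’ with change of ‘shape’ can be seen in the simplest possible case. The sketch below assumes a free particle, $H = p^2/2m$, so that each phase point moves in $q$ at its own constant rate $p/m$; a unit square of phase points is then sheared into an ever longer parallelogram. The names `shoelace_area`, `free_flow`, `square` and `sheared` are illustrative.

```python
def shoelace_area(poly):
    """Area of a polygon given as an ordered list of (q, p) vertices."""
    n = len(poly)
    s = sum(poly[i][0] * poly[(i + 1) % n][1] - poly[(i + 1) % n][0] * poly[i][1]
            for i in range(n))
    return abs(s) / 2.0

def free_flow(q, p, t, m=1.0):
    """Free particle: q advances at the velocity p/m, while p stays constant."""
    return q + p * t / m, p

# Boundary fixed by four representative points, followed for a time t = 10.
square = [(0.0, 1.0), (1.0, 1.0), (1.0, 2.0), (0.0, 2.0)]
sheared = [free_flow(q, p, 10.0) for q, p in square]

q_extent_before = max(q for q, _ in square) - min(q for q, _ in square)
q_extent_after = max(q for q, _ in sheared) - min(q for q, _ in sheared)
```

Because this motion is linear, the boundary remains a polygon and the comparison of areas is exact; for a general Hamiltonian the boundary would have to be followed with many more points.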

The simplicity of the results obtained in the above derivations de- 
pends on the choice of a phase space — constructed by the combination 
of configuration space with momentum space— as the apparatus for 
describing the condition of an ensemble. Principles so simple in form 
as those of the conservation of density in phase and the conservation 
of extension in phase would not appear if we used the language pro- 
vided by a combination of configuration space with velocity space. It 




is in large measure to Boltzmann† that we owe the recognition of
the importance of using momenta instead of velocities in statistical
mechanical considerations.

20. Invariance of density and of extension to canonical transformations

In making use of the foregoing ideas as to phase spaces for the treatment
of systems and ensembles, we shall usually regard the coordinates
and momenta, for the system of interest as a whole, as provided by
the total collection of coordinates and momenta, which would be
naturally assigned to the individual molecules composing the system
in simple and familiar ways. It is evident, however, that there is
nothing involved in the conceptual introduction of a $qp$-phase space to
represent the state of a system, or of a density of phase points $\rho$ for
the description of an ensemble, which implies the use of any particular
set of coordinates and momenta. Furthermore, it is evident that the
validity of the various forms of Liouville’s theorem derived in the last
section depends solely on the assumption of the equations of motion
in the canonical form for the systems treated. Hence the foregoing
methods and results would apply using any set of canonical variables
for the kind of system of interest.

The circumstance, that Liouville’s theorem would continue to hold
after any canonical transformation of variables, now makes it easy to
show that the density of distribution at any selected point in the phase
space would have a numerical value invariant to such canonical transformation.
To investigate this let us consider two different sets of
canonical variables, say $q_i p_i$ and $Q_i P_i$, and at any selected time $t_1$, let
us use the symbols $\rho_1$ and $\rho'_1$ to denote the density of distribution, in
the neighbourhood of any selected point in the phase space, in the two
languages respectively. Our problem, then, is to prove the equality of
$\rho_1$ and $\rho'_1$.

To show this let us now consider a representative phase point carrying
out its natural motion and so chosen as to coincide in position at time
$t_1$ with the arbitrary point which we have selected in the phase space.
At time $t_1$ the density in the neighbourhood of this phase point would
then be $\rho_1$ and $\rho'_1$ in the two languages respectively. At a later time
$t_2$ we may denote the density in its neighbourhood by $\rho_2$ and $\rho'_2$ in the

† See Boltzmann, Vorlesungen über Gastheorie, II. Teil, zweiter, unveränderter
Abdruck, Leipzig, 1912.





two languages, and in accordance with Liouville’s theorem in the form
of the conservation of density in phase we must then have

$$\rho_1 = \rho_2 \quad \text{and} \quad \rho'_1 = \rho'_2. \tag{20.1}$$

However, we may now evidently introduce† a third system of canonical
variables $q''_i p''_i$ which would coincide with the set $q_i p_i$ at time $t_1$ and
with the set $Q_i P_i$ at time $t_2$. Using similar notation to that above, this
then gives us the relations

$$\rho''_1 = \rho''_2, \quad \rho''_1 = \rho_1, \quad \text{and} \quad \rho''_2 = \rho'_2. \tag{20.2}$$

Combining with the second of the two equations (20.1), this then leads
to the required result

$$\rho_1 = \rho'_1. \tag{20.3}$$

Furthermore, combining with the first of equations (20.1), it also gives
us the analogous result, for the later time $t_2$,

$$\rho_2 = \rho'_2. \tag{20.4}$$

We have thus shown, as desired, the invariance of the density $\rho$ at any
arbitrary point and time to a canonical transformation of variables.

The finding that the density in phase would be invariant to canonical
transformations now makes it evident that the magnitude of any extension
in phase, with boundaries defined by a selection of definite phase
points, would also have to be invariant to canonical transformation.
To see this let us first consider an extension in phase small enough so
that we could regard the density $\rho$ as not varying appreciably within
its boundaries. Again taking $q_i p_i$ and $Q_i P_i$ as two different sets of
canonical variables, we could then write as expressions for the number
of phase points in the extension

$$\delta N = \rho \int \dots \int dq_1 \dots dp_f \quad \text{and} \quad \delta N' = \rho' \int \dots \int dQ_1 \dots dP_f \tag{20.5}$$

in the two languages respectively. Since we have proved $\rho = \rho'$, however,
and since we must in any case have $\delta N = \delta N'$ as exactly the
same phase points are involved in the two cases, we obtain the result

$$\int \dots \int dq_1 \dots dp_f = \int \dots \int dQ_1 \dots dP_f. \tag{20.6}$$


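† Let $F(q, P, t)$ be the generating function for the transformation from the set $q_i p_i$ to
the set $Q_i P_i$. Then the function

$$F'' = \sum_i q_i p''_i\, \frac{t_2 - t}{t_2 - t_1} + F(q, p'', t)\, \frac{t - t_1}{t_2 - t_1}$$

would be a possible generating function for the transformation from $q_i p_i$ to $q''_i p''_i$.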





Moreover, from the possibility of combining one element of extension
with another, we see that this invariance to canonical transformation
would have to hold for an extension in phase of any magnitude provided
its boundaries are objectively defined.
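
The invariance of extension can be tested numerically through the Jacobian of a transformation, which is unity when the transformation is canonical. The sketch below assumes one degree of freedom and compares two changes of variables chosen for illustration: an angle together with the energy-like quantity $(q^2 + p^2)/2$, which form a canonical pair, and the same angle together with the radius, which do not. All names are hypothetical.

```python
import math

def to_angle_energy(q, p):
    """Canonical map (q, p) -> (Q, P): an angle and P = (q^2 + p^2)/2."""
    return math.atan2(q, p), 0.5 * (q * q + p * p)

def to_angle_radius(q, p):
    """Plain polar variables (angle, radius); not a canonical pair."""
    return math.atan2(q, p), math.hypot(q, p)

def jacobian_det(transform, q, p, h=1e-6):
    """Determinant d(Q, P)/d(q, p) at the point (q, p), by central differences."""
    Q_qp, P_qp = transform(q + h, p)
    Q_qm, P_qm = transform(q - h, p)
    Q_pp, P_pp = transform(q, p + h)
    Q_pm, P_pm = transform(q, p - h)
    dQ_dq, dP_dq = (Q_qp - Q_qm) / (2 * h), (P_qp - P_qm) / (2 * h)
    dQ_dp, dP_dp = (Q_pp - Q_pm) / (2 * h), (P_pp - P_pm) / (2 * h)
    return dQ_dq * dP_dp - dQ_dp * dP_dq

det_canonical = jacobian_det(to_angle_energy, 0.8, 0.5)
det_polar = jacobian_det(to_angle_radius, 0.8, 0.5)
```

A unit Jacobian means that an element $dq\,dp$ and the corresponding element $dQ\,dP$ have the same magnitude, as (20.6) requires; for the non-canonical pair the determinant is the reciprocal of the radius, so that extensions would be rescaled from point to point.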

By making use of Liouville’s theorem in the form of the conservation
of extension in phase, the above result could also have been proved by
a derivation similar to that which we gave for the invariance of the
density $\rho$ to canonical transformation. In addition, the result (20.6)
could also be deduced by a direct consideration of the consequences of
making a canonical transformation of variables.†

Making use of the above invariance of any extension in phase space
$\int \dots \int dq_1 \dots dp_f$, we now always have the possibility, without resulting
effect on the numerical magnitude, of changing to a system of canonical
variables such that the $q$’s and $p$’s would be coordinates and momenta
for the various parts of the mechanical system in the ordinary sense
of the words. We then see that any extension in phase space would
have the dimensions

$$\left[ l \times \frac{ml}{t} \right]^f = \left[ \frac{ml^2}{t} \right]^f, \tag{20.7}$$

that is, of action raised to a power equal to the number of degrees of
freedom $f$ of the system.‡

This will be of interest in connexion with our later introduction of
the quantum mechanical point of view. We shall then find that the
exact specification of a quantum mechanical state could furnish information
as to the values of a coordinate $q$ and its conjugated momentum
$p$ only within limits corresponding to the Heisenberg uncertainty
relation

$$\Delta q\, \Delta p \approx h, \tag{20.8}$$

where $h$ is Planck’s fundamental quantum of action. Hence in the
case of a system of $f$ degrees of freedom, an extension in phase
of the magnitude

$$\int \dots \int dq_1 \dots dp_f = h^f \tag{20.9}$$

may be regarded as an approximate classical analogue for a precisely
defined quantum mechanical state.
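
As a rough illustration of (20.9) for $f = 1$, the sketch below assumes a harmonic oscillator and counts how many cells of extension $h$ fit inside the ellipse of energy $E$ in the $qp$-plane. The mass, frequency and function names are illustrative assumptions; only the value of Planck's constant is a fixed datum.

```python
import math

H_PLANCK = 6.62607015e-34  # Planck's constant in J s (SI defining value)

def oscillator_phase_area(E, m, omega):
    """Area of the ellipse p^2/(2m) + m omega^2 q^2/2 = E, i.e. 2 pi E / omega."""
    q_max = math.sqrt(2.0 * E / (m * omega ** 2))
    p_max = math.sqrt(2.0 * m * E)
    return math.pi * q_max * p_max

def approx_state_count(E, m, omega):
    """Extension in phase below energy E, measured in units of h (here f = 1)."""
    return oscillator_phase_area(E, m, omega) / H_PLANCK

# Assumed illustrative values: a molecular-scale mass and vibration frequency.
m, omega = 1.0e-26, 1.0e13
E = 10.0 * (H_PLANCK / (2.0 * math.pi)) * omega   # ten quanta of size h*nu
count = approx_state_count(E, m, omega)
```

The count comes out as the energy divided by the quantum $h\nu$, so that in this assumed case each cell of extension $h$ corresponds to about one quantum state of the oscillator.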


† For a general treatment of integrals invariant to canonical transformations see
Poincaré, Méthodes nouvelles de la mécanique céleste, vol. iii, chaps. 22-4, Paris, 1899.

See also Brody, Zeits. f. Phys. 6, 224 (1921). Starting with the invariance of extension
in the phase space it is possible to proceed in the reverse direction and derive Liouville’s
theorem. The order of treatment given above is that of Gibbs.

‡ It will be noted that the numerical magnitude of an extension in phase, although
invariant to canonical transformation, would depend, of course, on the units chosen for
mass, length, and time.





21. Conditions for statistical equilibrium 

Leaving this digression as to the consequences of transforming from
one set of variables to another, we may now return to our main line
of development for the ideas of the classical statistical mechanics. In
previous sections we have seen that the condition of an ensemble of
systems at any instant can be satisfactorily described by the density $\rho$
with which the corresponding representative points are distributed in
the appropriate phase space; we have found that a knowledge of this
density of distribution would be sufficient to determine average values
for different mechanical properties of the systems composing the ensemble;
and we have obtained (19.7) the simple expression

$$\left( \frac{\partial \rho}{\partial t} \right)_{q,p} = -\sum_i \left( \frac{\partial \rho}{\partial q_i} \dot q_i + \frac{\partial \rho}{\partial p_i} \dot p_i \right) \tag{21.1}$$

for the rate at which this density would be changing with time at any
selected fixed point in the phase space.

We may now inquire into the nature of the distributions of $\rho$ as a
function of the $q$’s and $p$’s which would make $\rho$ permanently independent
of time at all points in the phase space. Under such circumstances
the probabilities of finding phase points in the various regions of the
phase space and the average values for the properties of the systems
in the ensemble would be independent of time, and we could describe
the ensemble as being in statistical equilibrium.

Examining (21.1), we immediately see that a very simple method of
obtaining statistical equilibrium would be to set up an ensemble with
$\rho$ originally distributed uniformly over the whole of phase space,

$$\rho(q, p) = \text{const.} \tag{21.2}$$

This would then make

$$\frac{\partial \rho}{\partial q_i} = \frac{\partial \rho}{\partial p_i} = 0, \tag{21.3}$$

and hence, in accordance with (21.1), the uniform distribution would
be permanently maintained.

As a more general condition for statistical equilibrium, we could take
$\rho$ originally distributed as any function

$$\rho = \rho(\alpha) \tag{21.4}$$

of some constant of the motion $\alpha$ for the systems considered. Since $\alpha$
would itself be a function of the coordinates and momenta with a value
for any given system which would not change with time, we should
then have

$$\frac{d\alpha}{dt} = \sum_i \left( \frac{\partial \alpha}{\partial q_i} \dot q_i + \frac{\partial \alpha}{\partial p_i} \dot p_i \right) = 0, \tag{21.5}$$





provided $\dot q_i$ and $\dot p_i$ are rates of change with time for any actual system
of the kind considered. Hence, since the quantities $\dot q_i$ and $\dot p_i$ appearing
in (21.1) are such rates of change for the natural motion of a system
whose phase point is located at the position in phase space under consideration,
we now obtain

$$\left( \frac{\partial \rho}{\partial t} \right)_{q,p} = -\frac{d\rho}{d\alpha} \sum_i \left( \frac{\partial \alpha}{\partial q_i} \dot q_i + \frac{\partial \alpha}{\partial p_i} \dot p_i \right) = 0, \tag{21.6}$$

and see also under these circumstances that the original distribution
would be permanently maintained. In the case of a conservative system,
the energy of the system is the most natural constant of the
motion to use in setting up ensembles which are in permanent statistical
equilibrium.
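
The condition for statistical equilibrium can be checked numerically for a particular conservative system. The sketch below assumes a pendulum with $H = p^2/2 - \cos q$ and evaluates the right-hand side of (21.1) by finite differences for two assumed densities: one depending on $q$ and $p$ only through the energy, and one that is not a function of any constant of the motion. The names are illustrative.

```python
import math

def H(q, p):
    """Assumed Hamiltonian: a pendulum of unit mass, length and gravity."""
    return 0.5 * p * p - math.cos(q)

def d_rho_dt(rho, q, p, h=1e-5):
    """Right side of (21.1) at the fixed point (q, p), by central differences."""
    q_dot = (H(q, p + h) - H(q, p - h)) / (2 * h)       # dH/dp
    p_dot = -(H(q + h, p) - H(q - h, p)) / (2 * h)      # -dH/dq
    d_q = (rho(q + h, p) - rho(q - h, p)) / (2 * h)
    d_p = (rho(q, p + h) - rho(q, p - h)) / (2 * h)
    return -(d_q * q_dot + d_p * p_dot)

def rho_of_energy(q, p):
    """A density that depends on q and p only through the energy."""
    return math.exp(-H(q, p))

def rho_lopsided(q, p):
    """A density that is not a function of any constant of the motion."""
    return math.exp(-(q - 1.0) ** 2 - p * p)

rate_equilibrium = d_rho_dt(rho_of_energy, 0.7, 0.3)
rate_lopsided = d_rho_dt(rho_lopsided, 0.7, 0.3)
```

The first rate vanishes to within the error of the differences, in agreement with (21.6), while the second does not, so that the lopsided distribution would begin to change at once.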

Ensembles which have the property of being in statistical equilibrium
prove to be important both for the representation of systems which are
themselves in a condition of macroscopic equilibrium, and also for
obtaining an insight into the fundamental hypothesis of statistical
mechanics which we shall introduce later. We now turn in the next
section to a more detailed consideration of certain examples of ensembles
in statistical equilibrium.


22. The uniform, microcanonical, and canonical ensembles
(a) The uniform ensemble. We have already seen in the preceding
section that an ensemble with its representative phase points
distributed uniformly over the phase space with

$$\rho = \text{const.} \tag{22.1}$$

would retain this distribution permanently. Such an ensemble may be
called a uniform ensemble.

As a representative ensemble for making predictions as to the
behaviour of a single system in a condition corresponding to a partial
specification of precise state, the uniform ensemble is not of interest.
At the most we could only say that such an ensemble represented an
individual system concerning whose state we had no information
whatever. The uniform ensemble is of interest, however, in showing that
there is no inherent tendency for the representative points to ‘crowd’
into any particular region of the phase space, as we shall emphasize
in the next section.

An important property of the uniform ensemble arises from the
invariance to canonical transformations which we have found in § 20
for the density of distribution $\rho$ in the phase space. In accordance with
this invariance, we see that an ensemble which is set up with uniform
distribution with respect to one set of coordinates and momenta would
also be uniformly distributed, and indeed with a value for $\rho$ of the same
numerical magnitude, using any other set of coordinates and momenta.
Regarding a uniform ensemble as representing a system concerning
whose state we have no knowledge, the above conclusion may be
suggestively described by saying that we should be equally completely
ignorant as to the state in all languages.

(b) The microcanonical ensemble. In the case of conservative systems,
with which we shall for the most part be concerned, the energy
$E$ would be a constant of the motion. Hence for conservative
systems, in accordance with the preceding section, an ensemble with
the density $\rho$ distributed as any function of the energy $E$ would be in
statistical equilibrium.

A very useful ensemble of this kind can be obtained by taking the
density as equal to zero for all values of the energy except in a selected
narrow range $E$ to $E + \delta E$. Using the terminology of Gibbs, such an
ensemble, specified by

$$\rho = \text{const.} \quad (E \text{ to } E + \delta E), \qquad \rho = 0 \quad (\text{outside the above range}), \tag{22.2}$$

may be called a microcanonical ensemble.

An ensemble of this kind can be regarded as obtained from an
originally uniform ensemble by discarding all systems having phase
points with positions that do not fall within the limits in the phase space
that correspond to the energy range $E$ to $E + \delta E$. Since no phase point
could in any case cross over a surface of constant energy in the phase
space, we thus obtain an immediate insight into the reasons why such
an ensemble would remain in statistical equilibrium.

The microcanonical distribution is often employed as giving a
representative ensemble for predicting the properties of a system which is
itself known to be in a state of macroscopic equilibrium with an energy
in the range $E$ to $E + \delta E$. Indeed before the work of Gibbs it was the
only ensemble extensively considered.

If we consider the possibility of making the selected range of energies
$\delta E$ narrower and narrower, we can regard ourselves as approaching
the limit of a surface distribution of phase points all of which would
correspond to systems having precisely the same energy. The resulting
distribution might be called a surface ensemble, and in the older classical
statistical mechanics it was often felt that such a surface distribution
was specially appropriate for treating conservative systems, since
classically a system of interest would be regarded as having some perfectly
definite value of its energy even if we did not know precisely what
that value was. For such a surface ensemble the surface density of
distribution $\sigma$ would be given by


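$$\sigma = \frac{\text{const.}}{\sqrt{\sum_i\left[\left(\frac{\partial E}{\partial q_i}\right)^2 + \left(\frac{\partial E}{\partial p_i}\right)^2\right]}} = \frac{\text{const.}}{\sqrt{\sum_i\left(\dot q_i^2 + \dot p_i^2\right)}}, \tag{22.3}$$

where $\sqrt{\sum_i(\dot q_i^2 + \dot p_i^2)}$ is the magnitude of the velocity with which a representative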
point moves through phase space. There was seldom, if ever, any real
advantage, however, in going to this more complicated formulation.

(c) The canonical ensemble. Another kind of ensemble of importance
for conservative systems may be defined by taking the density of
distribution $\rho$ as given by

$$\rho = N e^{(\psi - E)/\theta}, \tag{22.4}$$

where $N$ is the total number of systems in the ensemble, $E$ is the
energy as a function of the coordinates and momenta, and $\psi$ and $\theta$ are
parameters, independent of the $q$'s and $p$'s, whose values complete the
description of the distribution. Such distributions were first introduced
by Gibbs under the name canonical ensemble. Since the energy $E$ would
be a constant of the motion for conservative systems, these
distributions as defined by (22.4) would then satisfy the conditions for statistical
equilibrium given in the preceding section.

The distribution parameters $\psi$ and $\theta$ appearing in (22.4) are constants
having the dimensions of energy. Their values are evidently not
independent since by integrating over the whole of phase space we obtain

$$N = \int\!\cdots\!\int \rho\, dq_1 \ldots dp_f = N \int\!\cdots\!\int e^{(\psi - E)/\theta}\, dq_1 \ldots dp_f \tag{22.5}$$

for the total number of systems in the ensemble. And this gives

$$e^{-\psi/\theta} = \int\!\cdots\!\int e^{-E/\theta}\, dq_1 \ldots dp_f \tag{22.6}$$

as a necessary relation which must be satisfied by $\psi$ and $\theta$. By different
choices of values for these parameters, that do satisfy the above
relation, we can then obtain different canonical distributions for a given
kind of system which will prove useful for representing the same system
of interest in different macroscopic conditions.
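
The dependence of $\psi$ on $\theta$ implied by (22.6) can be illustrated by a small numerical sketch. The system is assumed here, purely for illustration, to be a one-dimensional harmonic oscillator with $E = p^2/2m + m\omega^2 q^2/2$, for which the integral in (22.6) has the closed value $2\pi\theta/\omega$; the function names and the truncation of the integration region are likewise assumptions of the sketch.

```python
import math

def phase_integral(theta, mass=1.0, omega=1.0, half_width=12.0, n=400):
    # Midpoint-rule estimate of the integral of exp(-E/theta) over (q, p),
    # truncated to the square |q|, |p| < half_width, for the assumed energy
    # E = p^2/(2m) + m*omega^2*q^2/2.
    step = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        q = -half_width + (i + 0.5) * step
        for j in range(n):
            p = -half_width + (j + 0.5) * step
            energy = p * p / (2.0 * mass) + 0.5 * mass * omega ** 2 * q * q
            total += math.exp(-energy / theta)
    return total * step * step

def psi_from_theta(theta):
    # Relation (22.6): exp(-psi/theta) equals the phase integral.
    return -theta * math.log(phase_integral(theta))

integral_at_unit_theta = phase_integral(1.0)   # close to 2*pi
psi_at_two = psi_from_theta(2.0)               # close to -2*ln(4*pi)
```

Once $\theta$ is chosen, $\psi$ is thus fixed by the structure of the system, so that only one of the two parameters is free.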

Under the circumstances ordinarily encountered in statistical
mechanical applications, the canonical distribution as defined by (22.4)
is such that nearly all the systems in the ensemble have energies which
are very close to the average energy for the ensemble. This arises from
the combined effect of two factors. On the one hand, in the case of 
systems of many degrees of freedom, the volume in the phase space 
increases very rapidly as we include regions corresponding to higher 
and higher energies, and this tends to depress the numbers of systems 
in the ensemble with less than the average energy. On the other hand, 
the form of the distribution law itself, with the negative of the energy 
appearing in the exponent, tends to depress the numbers of systems 
with energies much greater than the average. The combined effect of 
the two factors usually leads to a high concentration in the neighbour- 
hood of the average, as we shall see in more detail later (§ 141 (b)). This 
makes it possible to employ canonical ensembles for the representation 
of systems of interest which themselves have an energy which is pre- 
cisely or at least nearly precisely defined. In particular, as first shown 
by Gibbs, canonical ensembles prove to be specially appropriate for the 
representation of systems in a condition of macroscopic thermodynamic 
equilibrium. The justification for this use of canonical ensembles will
be presented very completely from a modern quantum mechanical point
of view in Chapters XII and XIII.
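
The combined effect of the two factors can be made quantitative in a simple case. The sketch below assumes, as an illustration not taken from the text, a system whose energy is a sum of $f$ quadratic momentum terms with the coordinates confined to a fixed volume, so that the phase volume below energy $E$ grows as $E^{f/2}$ and the distribution in energy for the canonical ensemble is proportional to $E^{f/2-1}e^{-E/\theta}$. For this form the relative spread in energy is $\sqrt{2/f}$; the function name `energy_spread` is hypothetical.

```python
import math

def energy_spread(f, theta=1.0, n=20001):
    # Mean energy and relative spread (standard deviation / mean) for the
    # assumed energy distribution E**(f/2 - 1) * exp(-E/theta).
    a = 0.5 * f
    centre = a * theta
    width = theta * math.sqrt(a)
    lo = max(1e-12, centre - 12.0 * width)
    hi = centre + 12.0 * width
    step = (hi - lo) / (n - 1)
    energies = [lo + i * step for i in range(n)]
    # Work with logarithms so that large exponents do not overflow.
    log_w = [(a - 1.0) * math.log(e) - e / theta for e in energies]
    shift = max(log_w)
    weights = [math.exp(lw - shift) for lw in log_w]
    norm = sum(weights)
    mean = sum(w * e for w, e in zip(weights, energies)) / norm
    var = sum(w * (e - mean) ** 2 for w, e in zip(weights, energies)) / norm
    return mean, math.sqrt(var) / mean

spreads = {f: energy_spread(f)[1] for f in (4, 400, 40000)}
```

On this assumption the relative spread falls from about 0.7 for $f = 4$ to about 0.007 for $f = 40000$, which illustrates how sharply the energies concentrate about the average for systems of many degrees of freedom.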

The uniform, microcanonical, and canonical ensembles considered in 
this section all have distributions which are in statistical equilibrium. 
It will be appreciated, however, that ensembles having distributions 
which change with time can also be important and will indeed be 
essential for the representation of systems of interest which are not 
themselves in equilibrium. 

23. The fundamental hypothesis of equal a priori probabilities in the phase space

We must now undertake a consideration of the fundamental hypothesis,
as to equal a priori probabilities for equal regions in the phase
space, which has to be introduced into the classical statistical mechanics 
in order to make applications to situations of actual interest. Although 
we shall endeavour to show the reasonable character of this hypothesis, 
it must nevertheless be regarded as a postulate which can be ultimately 
justified only by the correspondence between the conclusions which it 
permits and the regularities in the behaviour of actual systems which 
are empirically found. 

The introduction of some kind of postulate as to a priori probabilities
is always involved in the use of statistical methods for predictive
purposes. In the case of statistical mechanics the typical situation, in
which statistical predictions are desired, arises when our knowledge of
the condition of some system of interest is not sufficient for a specification
of its precise state. Under such circumstances we take recourse to a
representative ensemble of systems of similar structure to the one of
actual interest, appropriately distributed over a variety of different
precise states, and then take the average properties and average
behaviour of the systems in this ensemble as providing our best estimates
as to the properties and behaviour of the actual system of interest. In
setting up such a representative ensemble we shall, of course, assign
the systems of the ensemble to different states in such a way as to
agree with our partial knowledge of the state of the actual system of
interest. We need some postulate as to a priori probabilities, however,
to guide us in making assignments of systems to states that agree
equally well with our knowledge as to the actual condition of the
system. This necessity for an additional postulate arises not from
any incompleteness nor inexactness in the principles of the classical
mechanics, when applied to the conceptual situations for which this
discipline was immediately devised, but from the real extension in
theory which is needed when we are confronted by incompleteness and
inexactness in the specification of states for the systems we wish to
treat.

For the above purpose we now introduce the hypothesis of equal
a priori probabilities for different regions in the phase space that
correspond to extensions of the same magnitude. By this we mean that the
phase point for a given system is just as likely to be in one region of
the phase space as in any other region of the same extent which
corresponds equally well with what knowledge we do have as to the
condition of the system. Thus, if we consider groups of neighbouring
states that correspond to differently located regions in the phase space
$R_1$, $R_2$, $R_3$, etc., having the extensions in phase

$$v_1 = \int\!\cdots\!\int_{R_1} dq_1 \ldots dp_f, \qquad v_2 = \int\!\cdots\!\int_{R_2} dq_1 \ldots dp_f, \quad \text{etc.}, \tag{23.1}$$

we shall take the probabilities of finding the system in these different
groups of states as proportional to the extensions $v_1$, $v_2$, $v_3$,... provided
our actual knowledge as to the condition of the system is equally
well represented by the states in any of the groups considered. For
example, if all we know about the state of a system is that its energy
lies in the range $E$ to $E + \delta E$, and we consider different extensions in
the phase space $v_1$, $v_2$, etc., which themselves lie entirely in the
range $E$ to $E + \delta E$, we shall take these extensions as giving the relative
probabilities that the system itself would be found on examination to
be in the corresponding different conditions.

As already emphasized, this principle must be regarded as a postu- 
late. Nevertheless, we can see the reasonable character of the principle 
if we consider the behaviour of the uniform ensemble which was studied 
in the last section. It was there shown, if we set up an ensemble with 
the phase points for its member systems uniformly distributed, with the 
same number in all equal regions of the phase space without reference
to location, that this uniform density of distribution would be
permanently maintained as time proceeds. We thus find that the principles
of mechanics do not themselves include any tendency for phase points
to concentrate in particular regions of the phase space. (This may be
contrasted, for example, with the tendency to concentrate which would
be found in general if we used a combined configuration and velocity
space for plotting the changing states of our systems.) Under the
circumstances we then have no justification for proceeding in any manner
other than that of assigning equal probabilities for a system to be in 
different equal regions of the phase space that correspond, to the same 
degree, with what knowledge we do have as to the actual state of the 
system. And, as already intimated, we shall, of course, find that the 
results which can then be calculated as to the properties and behaviour 
of systems do agree with empirical findings. 
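
The absence of any tendency for phase points to concentrate can be illustrated numerically by following a small element of extension in phase as it moves. The sketch below assumes a pendulum with unit mass, length, and gravity, and integrates its motion with the leapfrog scheme; both choices, and the function names, are illustrative assumptions. The Jacobian determinant of the map from initial to final $(q, p)$, which measures the ratio of final to initial extension, is estimated by finite differences.

```python
import math

def force(q):
    # Pendulum with unit mass, length and gravity (an illustrative assumption).
    return -math.sin(q)

def leapfrog(q, p, dt, steps):
    # Kick-drift-kick integration of Hamilton's equations.
    for _ in range(steps):
        p += 0.5 * dt * force(q)
        q += dt * p
        p += 0.5 * dt * force(q)
    return q, p

def jacobian_determinant(q, p, dt=0.01, steps=500, h=1e-6):
    # Finite-difference Jacobian of the map (q, p) -> (q', p').
    qa, pa = leapfrog(q + h, p, dt, steps)
    qb, pb = leapfrog(q - h, p, dt, steps)
    qc, pc = leapfrog(q, p + h, dt, steps)
    qd, pd = leapfrog(q, p - h, dt, steps)
    dq_dq, dp_dq = (qa - qb) / (2 * h), (pa - pb) / (2 * h)
    dq_dp, dp_dp = (qc - qd) / (2 * h), (pc - pd) / (2 * h)
    return dq_dq * dp_dp - dq_dp * dp_dq

determinants = [jacobian_determinant(q, p) for q, p in [(0.5, 0.3), (1.0, 0.5)]]
```

The determinants come out equal to unity within the finite-difference error, so that a small region keeps its extension as it moves. It should be noted that the leapfrog scheme is only an approximation to the true motion, although each of its steps is itself a map that preserves extension in phase.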

In concluding this section on the hypothesis of equal a priori
probabilities for equal extensions $\int\!\cdots\!\int dq_1 \ldots dp_f$ in the phase space,
attention may be called to our previous finding, see § 20, that the magnitudes
of extensions in phase space are invariant to canonical transformation. 
It will hence be appreciated that our hypothesis as to equal a priori 
probabilities holds equally well in the language of any set of canonical 
variables. 

Attention may also be called to our previous remark (see end of
§ 20) that an extension in phase space of the magnitude

$$\int\!\cdots\!\int dq_1 \ldots dp_f = h^f \tag{23.2}$$

might be regarded as an approximate classical picture for a precisely
defined quantum mechanical state. At the correspondence principle
limit, where conditions prevail such that either classical or quantum
mechanical methods can be applied, we may then expect the principle
of equal a priori probabilities in the phase space to imply, at least for
those conditions, a principle of equal a priori probabilities for different
quantum mechanical states. And we shall indeed find when we come
to the development of quantum statistics that our postulatory basis
will contain an hypothesis which will assure this agreement.

24. System of interest and representative ensemble 

Accepting the postulate as to equal a priori probabilities for equal
regions in the phase space, we are now provided with principles for
selecting an appropriate ensemble to represent an individual system in
a partially specified state, or vice versa of determining what knowledge
as to the condition of an individual system is represented by a given
ensemble. The essence of the relationship between representative
ensemble and system of interest is that the distribution of the members
of the ensemble over different states agrees with what is known as to
the actual state of the system of interest but is otherwise uniform in
the phase space in accordance with the hypothesis as to equal a priori
probabilities.

With the help of this connexion we can then use the appropriate
representative ensemble for making estimates as to the properties and
behaviour of the corresponding system of interest itself. To do this we
take the fraction of all the systems in the representative ensemble in
any range of precise states as equal to the probability of finding the
system itself, on examination, to be in that range of states; and we take
the average value for any property for all the systems of the ensemble
as equal to the average value of that property for the system itself.
By the probability of finding the system in any given condition, we
mean the fractional number of times it would actually be found in that
condition on repeated trials of the same experiment. And by the
average value of any property of the system we mean the average that
would actually be found on such repeated trials. As averages we may,
of course, take means, most probable values, or other forms of average,
as seems convenient or appropriate. We may then use such averages as
furnishing estimates for the properties of the system itself.

The above may also be expressed in more specific mathematical form.
If we consider a condition of the system that corresponds to a group
of states lying in a given region of the phase space $R$, we can take the
probability $P$ of finding the system in that condition as given by

$$P = \frac{1}{N}\int\!\cdots\!\int_{R} \rho(q,p)\, dq_1 \ldots dp_f, \tag{24.1}$$

where $N$ is the total number of systems in the representative ensemble
and $\rho(q,p)$ is their density of distribution. If we consider two such
conditions, corresponding to regions $R_1$ and $R_2$, we can take the ratio
of probabilities of finding the system in these conditions as given by

$$\frac{P_1}{P_2} = \frac{\int\!\cdots\!\int_{R_1} \rho(q,p)\, dq_1 \ldots dp_f}{\int\!\cdots\!\int_{R_2} \rho(q,p)\, dq_1 \ldots dp_f}. \tag{24.2}$$

And if, for example, we are interested in the mean as the average for
any property of the system, which is a function $F(q,p)$ of the coordinates
and momenta, we can use therefor our previous formula (18.6) for the
mean over all the members of the ensemble

$$\bar F = \frac{\int\!\cdots\!\int F(q,p)\,\rho(q,p)\, dq_1 \ldots dp_f}{\int\!\cdots\!\int \rho(q,p)\, dq_1 \ldots dp_f}. \tag{24.3}$$

In concluding this section on the relation between system of interest 
and corresponding representative ensemble, it may be noted that sys- 
tems which are themselves in a steady condition of macroscopic equi- 
librium will be correlated with ensembles, which are in statistical 
equilibrium, and which hence exhibit constant average properties. On 
the other hand, systems whose macroscopic properties are changing 
with time will be correlated with ensembles, which are not in statistical 
equilibrium, and which hence give predictions as to average properties 
that change with time. In the next chapter we shall consider systems 
and ensembles in steady conditions, and in following chapters systems 
and ensembles that change with time. 
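
The use of a representative ensemble for estimating probabilities and mean values may be sketched numerically as follows. The example assumes, for illustration only, a one-dimensional harmonic oscillator of unit mass and frequency represented by a density of the canonical form $e^{-E/\theta}$ with $\theta = 1$, and replaces the integrals over phase space by sums over a fine grid; the function names are hypothetical. Since $N$ is the integral of the density over the whole phase space, the constant factor in the density cancels.

```python
import math

def energy(q, p):
    # One-dimensional oscillator with unit mass and frequency (assumed).
    return 0.5 * (p * p + q * q)

def density(q, p, theta=1.0):
    # Canonical form exp(-E/theta); constant prefactors cancel in the ratios.
    return math.exp(-energy(q, p) / theta)

def phase_integral(integrand, half_width=8.0, n=400):
    # Midpoint rule over the square |q|, |p| < half_width.
    step = 2.0 * half_width / n
    total = 0.0
    for i in range(n):
        q = -half_width + (i + 0.5) * step
        for j in range(n):
            p = -half_width + (j + 0.5) * step
            total += integrand(q, p)
    return total * step * step

def probability(in_region):
    # Fraction of the ensemble lying in the region R, as in (24.1).
    inside = phase_integral(
        lambda q, p: density(q, p) if in_region(q, p) else 0.0)
    return inside / phase_integral(density)

def ensemble_mean(prop):
    # Mean of a property F(q, p) over the ensemble.
    weighted = phase_integral(lambda q, p: prop(q, p) * density(q, p))
    return weighted / phase_integral(density)

prob_low_energy = probability(lambda q, p: energy(q, p) < 1.0)
mean_energy = ensemble_mean(energy)
```

For this assumed ensemble the probability of finding the energy below $\theta$ is $1 - e^{-1}$ and the mean energy is $\theta$, and the numerical values agree with these to within the error of the grid.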

25. Validity of statistical mechanics 

It has been made clear by the foregoing that statistical methods can 
be employed in a very natural manner for predicting the properties 
and behaviour of any given system of interest in a partially specified 
state, by the procedure of taking averages in an appropriately chosen 
ensemble of similar systems as giving reasonable estimates for quantities 
pertaining to the actual system. We may now conclude the present 
chapter with some remarks concerning the point of view as to the 
validity of these methods which is adopted for the purposes of the 
present book. 

In the first place, it is to be emphasized, in accordance with the 
viewpoint here chosen, that the proposed methods are to be regarded 
as really statistical in character, and that the results which they provide 
are to be regarded as true on the average for the systems in an appro- 


Px _ 

■Pa \ \ p{q,p)dq^...dpf 



64 STATISTICAL ENSEMBLES IN THE CLASSICAL MK.CHANTCS C’linp. Ill 

priately chosen ensemble, rather than as necessarily j)ro(riscIy (rue in 
any individual case. In the second place, it is to bo oin|)liasi7,e<l (hat 
the representative ensembles chosen as appropriate are to be con- 
structed with the help of an hypothesis, as to etpial a priori probabilii ies, 
which is introduced at the start, without 2)roqf , as a necessary pos( ula((*. 

Concerning the first of these apparent limitations, it is to be remarked
that we have, of course, no just grounds for objecting to the fact that
our methods provide us with average rather than precise results. This
is merely an inevitable consequence of the statistical nature of our
attack, and we have committed ourselves to statistical rather than
precise methods, either because we are forced thereto by lack of precise
initial knowledge or because the practical problems which we have in
mind are otherwise too complicated for treatment. Moreover, it is to
be noted that the proposed methods make it possible to compute not
only the average values of quantities but also the average fluctuations
around those values. This, then, makes it possible to draw conclusions
also as to the frequency with which we may expect to find systems
with properties differing from the average to any specified extent. In
the case of typical applications the computed fluctuations are extremely
small. In the special cases where they are large enough they may be
compared with what is found experimentally.

Concerning the second of the above-mentioned limitations on the
character of the proposed methods, two remarks already made in
the preceding section may again be emphasized. In the first place, it
is to be appreciated that some postulate as to the a priori probabilities
for different regions in the phase space has in any case to be chosen.
This again is merely a consequence of our commitment to statistical
methods. It is analogous to the necessity of making some preliminary
assumption as to the probabilities for heads or tails in order to predict
the results to be expected on flipping a coin. In the second place, it
is to be emphasized that the actual assumption, of equal a priori
probabilities for different regions of equal extent in the phase space,
is the only general hypothesis that can reasonably be chosen. With the
help of Liouville’s theorem, it has been shown that the principles of
mechanics do not themselves contain any tendency for phase points to
crowd into one region in the phase space rather than another; and
hence, in the absence of any knowledge except that our systems do
obey the laws of mechanics, it would be arbitrary to make any
assumption other than that of equal a priori probabilities for different regions
of equal extent in the phase space. The procedure may be regarded as
roughly analogous to the assumption of equal probabilities for heads
and tails, after a preliminary investigation has shown that the coin has
not been ‘loaded’.

In further support of the validity of the proposed methods it may, 
of course, again be emphasized that they have the a posteriori justi- 
fication of leading to conclusions which do agree with empirical facts. 
This includes agreement with conclusions not only as to average values
but also as to fluctuations.

Hence the present point of view as to the validity of the methods of
statistical mechanics may be summarized as follows. The methods are
essentially statistical in character and only purport to give results that
may be expected on the average rather than precisely expected for any
particular system. The methods lead to calculated fluctuations around
the averages which are exceedingly small in the case of the usual typical
applications, and in other cases can be compared with empirical
findings. The methods being statistical in character have to be based on
some hypothesis as to a priori probabilities, and the hypothesis chosen
is the only postulate that can be introduced without proceeding in an
arbitrary manner. The methods lead to results which do agree with
empirical findings.†

In the course of the historical development of statistical mechanics,
the above point of view as to the validity of its methods was not the
one originally adopted by Maxwell and Boltzmann. With the help of
a different assumption, rather than our own bald postulate as to equal
a priori probabilities, it was hoped to justify the methods of statistical
mechanics by showing that the time average of any quantity pertaining
to any single system of interest would actually agree with the ensemble
average for that quantity calculated by the methods of statistical
mechanics for all the members of the corresponding representative
ensemble. The postulate leading to this conclusion was called by
Boltzmann the ergodic hypothesis, and by Maxwell the assumption of continuity
of path. It states that the phase point for any isolated system would
pass in succession through every point compatible with the energy of the
system before finally returning to its original position in the phase space.

† The above point of view does not conflict with that of Gibbs. He was much less
explicit, however, concerning the hypothesis of equal a priori probabilities for equal
regions in the phase space, and much less confident about the validity of statistical
mechanics, presumably because of its failure to account for phenomena, such as specific
heats, which we now know must be treated by the methods of quantum rather than
classical statistical mechanics. The point of view appears in several ways sympathetic
with that adopted by Fowler in his Statistical Mechanics.




We may first show that this ergodic hypothesis would lead to the
above-stated simple relation between time averages and ensemble
averages. To see this let us consider the behaviour of an isolated
system of energy $E$ and the behaviour of the corresponding
microcanonical ensemble of systems with phase points uniformly distributed
in the shell between $E$ and $E + \delta E$, where $\delta E$ is regarded as going to
zero. In accordance with the ergodic hypothesis all the systems in this
ensemble, and the system of interest itself, would have to exhibit
exactly the same long time behaviour (as we let $\delta E$ go to zero), since
each would have to pass in succession through all states compatible
with the energy $E$. Hence, if we divide our phase space up into different
elementary regions $k$, $l$, $m$,..., we could regard all systems in the ensemble,
and the system of interest itself, as spending the same fractions,
$w_k$, $w_l$, $w_m$,..., of any very long time interval 0 to $T$ in these different
regions.

Taking $N$ as the total number of systems in the ensemble, we could
then write $N w_k T$, $N w_l T$, $N w_m T$,... as expressions for the total time,
during the interval 0 to $T$, which would be spent by members of the
ensemble in the regions specified. On the other hand, if we denote by
$N_k$, $N_l$, $N_m$,... the numbers of systems at any time $t$ in these regions, we
could also write $\int N_k\,dt$, $\int N_l\,dt$, $\int N_m\,dt$,... for these same quantities,
where we integrate over the time interval 0 to $T$. Equating, we then
have

$$N w_k T = \int_0^T N_k\, dt, \qquad N w_l T = \int_0^T N_l\, dt, \qquad \ldots \tag{25.1}$$

The quantities $N_k$, $N_l$, $N_m$,... are, however, independent of time since
the microcanonical ensemble has been shown to be one that remains
in statistical equilibrium. This then gives us

$$w_k = \frac{N_k}{N}, \qquad w_l = \frac{N_l}{N}, \qquad \ldots \tag{25.2}$$


This result then states that the fraction of the time during a long
interval 0 to $T$ which a given conservative system of interest would
spend in any region of the phase space, would be equal to the fraction
of all the systems in the corresponding representative ensemble, which
would be in that same region. The probability of finding any given
system of interest at a random instant of time in any specified state
would then be equal to the probability of finding a system picked
at random from the corresponding representative ensemble in that
state, and the time average of any quantity pertaining to a single
system of interest would be given by the corresponding ensemble
average.
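
The equality of time average and ensemble average which the ergodic hypothesis would secure can be exhibited in the simplest possible case. The sketch below assumes a single one-dimensional harmonic oscillator of unit mass and frequency, whose trajectory does traverse the whole curve of constant energy, and compares the time average of $q^2$ over one period with its average over a thin microcanonical shell. The choice of system, the shell thickness, and the function names are assumptions of the sketch.

```python
import math

def time_average_q2(energy, samples=10000):
    # Motion q(t) = sqrt(2E) cos t for unit mass and frequency; the average
    # of q^2 is taken over equally spaced instants within one period.
    amplitude = math.sqrt(2.0 * energy)
    total = 0.0
    for k in range(samples):
        t = 2.0 * math.pi * (k + 0.5) / samples
        total += (amplitude * math.cos(t)) ** 2
    return total / samples

def shell_average_q2(energy, delta=1e-4, n_r=20, n_angle=2000):
    # Uniform density in the shell E to E + delta. In the (q, p) plane this
    # shell is an annulus, and uniformity in area gives the weight r dr dphi.
    r0 = math.sqrt(2.0 * energy)
    r1 = math.sqrt(2.0 * (energy + delta))
    dr = (r1 - r0) / n_r
    weighted, norm = 0.0, 0.0
    for i in range(n_r):
        r = r0 + (i + 0.5) * dr
        for j in range(n_angle):
            phi = 2.0 * math.pi * (j + 0.5) / n_angle
            weighted += r * (r * math.cos(phi)) ** 2
            norm += r
    return weighted / norm

time_avg = time_average_q2(1.5)
shell_avg = shell_average_q2(1.5)
```

Both averages come out equal to the energy, apart from a difference of the order of the assumed shell thickness.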

Hence the older point of view, based on the ergodic hypothesis, might 
seem at first sight to furnish a more satisfactory justification for the 
methods of statistical mechanics than is furnished by the point of view 
which has been adopted in this book. In the first place, the older 
development was regarded as primarily based on an hypothesis which 
might be an actual consequence of the exact laws of mechanics, while
our development is based on a definite postulate as to a priori prob- 
abilities, which can only be justified if its consequences do correspond 
with statistical findings. In the second place, the older development 
claims that every individual system will exhibit time averages that 
agree with ensemble averages, while our development only assumes that 
the averages obtained on successive trials of the same experiment will 
agree with ensemble averages, thus permitting any particular individual 
system to exhibit a behaviour in time very different from the average. 
Nevertheless, it now seems clear, firstly, that the ergodic hypothesis 
cannot be strictly maintained in its original form, and, secondly, that 
its employment even as a working hypothesis would deny to statistical 
ensembles their full function of representing the relative probabilities 
for all the different kinds of states, including the unusual ones, which 
might be exhibited by a given system of interest. 

With regard to the possible truth of the ergodic hypothesis itself, it
is evident, except for the trivial one-dimensional case, that a mechanical
system could not possibly obey this hypothesis in the original strict
form which would require the phase point to pass precisely through
every point in the phase space compatible with the system’s energy.
For a system of $f$ degrees of freedom there would be $2f - 1$ quantities
besides the energy which would be arbitrary constants of the motion,
and in effect only one of these would be ‘used up’ in specifying the
initial position of the phase point along any particular trajectory.
Hence the phase point for a system with one assignment of constant
values for the remaining $2f - 2$ quantities could certainly not pass
through points in the phase space corresponding to any other
assignment even though these points did correspond to the right energy.

In the case of ‘simple’ quantities, which are constants of the motion,
for example the components of linear and angular momentum for the
system as a whole, we should presumably regard ourselves as knowing
the values of these quantities for the actual system of interest, and
hence the above limitation would have no effect on our considerations
since we should then only use systems all having the same (e.g. zero)
components of total momenta in constructing the appropriate
representative ensemble. And in the case of constants of the motion, having
a very complicated dependence on the coordinates and momenta, the
limitation to a particular choice of values might usually be such as to
permit the phase point to visit substantially all regions of the ergodic
surface even though it could not pass precisely through every point.
Nevertheless, it is easy to think of cases where the departure from the
consequences of the ergodic hypothesis would be very great. Thus, to
take a system of the kind usual in classical statistical applications, let
us consider a gas, contained in an exactly rectangular box, and
composed of rigid particles moving back and forth perpendicular to a pair
of parallel faces in such a way as to avoid collisions. Since this
to-and-fro motion would be permanently maintained, it is evident that the
phase point for this system would certainly not visit all significant
regions in the phase space compatible with the energy, and that the
time averages for quantities pertaining to the system (e.g. pressure on
the walls) would be very different from the ensemble averages over the
corresponding microcanonical distribution.
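
The departure from the ergodic expectation in this example can be put in numbers by a small sketch. A single particle of unit mass in a square box of unit side is assumed here as a stand-in for the gas, moving permanently back and forth along one axis; the starting positions and function names are illustrative assumptions. The time-averaged force on a wall is found by counting impacts, each of which transfers momentum $2mv$, and is compared with the value $E/L$ per wall which the microcanonical average for such a particle would give, the kinetic energy then being shared equally between the two directions.

```python
import math

def wall_hits(start, speed, side, total_time):
    # Number of impacts on the wall at coordinate `side` for a particle that
    # starts at `start` moving toward that wall and is reflected elastically
    # between 0 and `side` (assumes speed >= 0).
    return math.floor((start + speed * total_time + side) / (2.0 * side))

def time_averaged_forces(vx, vy, side=1.0, mass=1.0, total_time=1000.0):
    # Each impact transfers momentum 2*m*v to the wall that is struck.
    hits_x = wall_hits(0.3 * side, vx, side, total_time)
    hits_y = wall_hits(0.6 * side, vy, side, total_time)
    force_x = hits_x * 2.0 * mass * vx / total_time
    force_y = hits_y * 2.0 * mass * vy / total_time
    return force_x, force_y

energy = 0.5
speed = math.sqrt(2.0 * energy)        # unit mass assumed
force_x, force_y = time_averaged_forces(speed, 0.0)
microcanonical_force = energy / 1.0    # E/L on every wall
```

The time-averaged force on the walls struck by the particle is twice the microcanonical value, while that on the other pair of walls is zero, so that the time averages and ensemble averages differ greatly, as stated.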

The foregoing now also makes it evident why the use of the ergmlic 
hypothesis, even merely as a heuristic principle, would prevent the 
methods of statistical mechanics from exhibiting their full statistical 
character and usefulness. Since it is conceptually possible to have a 
system of interest permanently remain in a condition very diiferont 
from the average condition for all the members of the corresponding 
representative ensemble, it would be quite inappropriate to employ 
an hypothesis which would regard all the systems of the representative 
ensemble as being in a condition to give the same time averages for 
their properties. Bather we must wish our representative ensumbies to 
be capable of furnishing predictions as to the frequency with which wo 
may expect to find a system of interest in any kind of condition, 
including even such rare kinds as that given by the example of a gas 
composed of particles moving permanently back and forth between a
pair of parallel faces. Moreover, from the point of view adopted in the
present exposition, this complete function of our statistical methods 
is achieved by our employment of the hypothesis of equal a priori 
probabilities for equal regions in the phase space in the construction 
of representative ensembles. This procedure, which, as we have seen, 
is the only possible non-arbitrary one, does provide ensembles which 
would give an exceedingly small representation to conditions that we 
recognize as rare, and indeed in general provides ensembles which give
predictions that do accord with actual phenomena. 

A further unsatisfactory aspect of the original introduction of the 
ergodic hypothesis may be emphasized. Although the ergodic hypothesis,
if true in its original form, would have secured an equality
between the time averages for any individual system and the ensemble 
averages over the corresponding microcanonical distribution, the
certainty of this result was only demonstrated for exceedingly long time
intervals, and no assurance as to the pertinence of the result with 
respect to the short time intervals involved in actual experimentation 
was provided.† Thus even the ergodic hypothesis leads to the desired
kind of prediction for a system of interest only when supplemented by 
the further essentially statistical assumption that the properties of a 
system at any one time could be successfully taken as their long time 
average. 

It must hence be felt that the point of view of the early founders 
did not give due recognition to the truly statistical character of the 
problems to be attacked.‡ Impressed by the exact character of the
principles of classical mechanics, and also by the actual regularities in 
the macroscopic behaviour of systems composed of many molecules, 
they apparently hoped to secure really precise results for such systems 
by the temporary introduction of an hypothesis which might itself be 
ultimately validated from the principles of mechanics proper, and did 
not sufficiently appreciate that a further essentially statistical
assumption would be needed even if their hypothesis were valid. As
emphasized in § 23, however, the treatment of systems which are not in
precisely specified states must involve an actual extension of theory,
which requires the introduction of an additional postulate not derivable
from those included in the mechanics of precise states. 

† Compare with the similar view expressed by Fowler, Statistical Mechanics, second
edition, Cambridge, 1936, § 1.4.

‡ This failure to adopt a truly statistical viewpoint is not to be ascribed to Gibbs.

Returning to the course of historical development, after it was
appreciated that the ergodic hypothesis in its original form could not be
strictly true, attention was turned to the study of the so-called
quasi-ergodic hypothesis in accordance with which the phase point would in
time approach as near as desired to any selected point on the ergodic
surface, and also to the study of the additional conditions which would
be sufficient to secure desirable relations between time and ensemble
averages.† These studies have recently been advanced by the work of
von Neumann, Birkhoff, and others‡ showing that there would be no
inconsistency in the existence of a class of mechanical systems of a
character such that there would be a great ‘tendency’ for individual
paths to exhibit time averages which would be ‘nearly’ the same as
the ensemble averages over the corresponding microcanonical ensemble.
Nevertheless, it is evident that such studies can neither contradict the
mechanical possibility for some paths, which would exhibit time
averages quite different from the ensemble averages, nor eliminate the
ultimate necessity for recourse to a postulate as to a priori probabilities.

From the point of view adopted in the present book, the hypothesis
of equal a priori probabilities for equal regions in the phase space must
in any case be regarded as an essential element of statistical mechanics,
which has to be introduced by postulation and which is then sufficient
to provide the statistical methods used. Without this postulate there
would be nothing to correspond to the circumstance that nature does
not have any tendency to present us with systems in conditions which
we regard as mechanically entirely possible but statistically improbable;
and with the postulate the use of statistical mechanics for the
determination of averages and fluctuations then becomes merely a matter
for computation. 

† The possibility for a quasi-ergodic hypothesis and the necessity for additional
assumptions, to secure the desired relation between time and ensemble averages, were
first appreciated by P. and T. Ehrenfest, Encykl. d. Math. Wiss. IV. 2, ii, Heft 6, Leipzig.
See footnotes 89a and 90.

‡ See von Neumann, Proc. Nat. Acad. 18, 70 (1932); ibid. 18, 203 (1932); Birkhoff,
ibid. 17, 656 (1931); Birkhoff and Koopman, ibid. 18, 279 (1932).



IV 

THE MAXWELL-BOLTZMANN DISTRIBUTION LAW 

26. The microcanonical ensemble as representing a system in 

equilibrium 

We are now ready to commence the application of the classical
statistical mechanics to the study of systems composed of large numbers 
of individual molecules. This is the kind of system for whose treat- 
ment the methods of statistical mechanics were originally especially 
devised. 

In the present chapter we shall be interested in the properties of 
such systems when they can be regarded, from a gross point of view 
which neglects the behaviour of individual molecules, as being in a 
steady condition of macroscopic equilibrium. These properties will be 
determined by the manner in which the molecules composing the system 
are then distributed among their own individual states, and our main
task in the present chapter will be a derivation of the so-called
Maxwell-Boltzmann distribution law, which gives the distribution over different
molecular states that can be expected to prevail when the system has 
come to macroscopic equilibrium. 

Owing to the complicated character and, in practical situations,
incomplete specification of state of a system composed of an enormous
number of individual molecules, we shall wish to employ the methods 
of statistical mechanics in solving the above problem. We must hence 
first determine an appropriate statistical ensemble of systems, similar 
in structure to the one of actual interest, which can be used to represent 
the average properties to be expected for the system itself. In the case 
of a system which is given a definite fixed energy and which is enclosed 
in a fixed container in order to avoid the dissipation of matter, we shall 
find that a microcanonical distribution will give a suitable representa- 
tive ensemble. 

Consider a system composed of a large number of individual mole- 
cules, having the total energy E, and enclosed in a stationary container 
of fixed volume v. Let the walls of the container be of an idealized 
non-conducting character so that there will be no direct flow of energy 
through them as a result of molecular collisions. And let the container 
itself be sufficiently massive so that it can be regarded as approximately 
stationary with no appreciable transfer of kinetic energy between the
system and the container when collisions take place. Furthermore, if
any external fields of force act on the molecules, let them be
conservative in character and unchanging in time.

The total energy of the system would then have a constant value 
independent of the time. On the other hand, the components of total
linear momentum for the system would be able to adjust themselves
by interaction with the walls of the approximately stationary container, 
and the components of angular momentum would also be able to make 
such an adjustment, except for the special case of a container with 
walls which are ideally smooth and have the shape of a surface of 
revolution. 

Let us now assume that all we know about the mechanical state of
the system of molecules is the constant value of its total energy E. In
view of the partial character of this specification of precise state, we
must then apply the methods of statistical mechanics in order to obtain
estimates as to the expected properties and behaviour of the system.
This we do by resorting to an appropriate representative ensemble of 
similar systems. 

In accordance with the relations between system of interest and
representative ensemble as discussed in § 24, the phase points for the
members of the representative ensemble are to be taken, in the first
place, as distributed in a manner to agree with our partial knowledge
of the actual state, and, in the second place, as otherwise distributed
in as uniform a manner as possible in agreement with the principle of
equal a priori probabilities for equal regions in the phase space. In the
present case this can evidently be secured by taking a microcanonical
ensemble of systems, all having the structure and volume of interest,
and having their representative points uniformly distributed over that
portion of the phase space which lies in a narrow shell between the
surfaces of constant energy $E$ and $E+\delta E$, that is, by taking

$$
\begin{aligned}
\rho &= \text{const.} && (E \text{ to } E+\delta E),\\
\rho &= 0 && (\text{outside the above range}).
\end{aligned}
\tag{26.1}
$$

As we let $\delta E$ go to zero, we then fulfil the requirements that the
distribution should be left uniform except in so far as is needed to represent
our partial knowledge of the state of the system, namely in this case
its energy $E$.
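
The prescription (26.1) can be illustrated by a short numerical sketch. The Python fragment below draws phase points uniformly from the energy shell for a deliberately simple case that is assumed here only for illustration: n non-interacting point particles of equal mass in a cubical box, so that the energy depends on the momenta alone. The function names, the box geometry, and the choice of units are all hypothetical and are not taken from the text.

```python
import math
import random


def sample_microcanonical(n, mass, energy, d_energy, box, rng):
    """Draw one phase point uniformly from the shell E <= H <= E + dE.

    Illustrative assumption: n free point particles in three dimensions,
    H = sum(p^2) / (2 m), confined to a cube of side `box`.
    """
    dim = 3 * n
    # The energy does not depend on the positions, so a uniform density
    # in phase space means positions uniform throughout the box.
    q = [rng.uniform(0.0, box) for _ in range(dim)]
    # Momenta: a uniformly distributed direction in dim dimensions ...
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in g))
    # ... and a radius whose density is proportional to r**(dim - 1)
    # between the two bounding energy surfaces.
    r_lo = math.sqrt(2.0 * mass * energy)
    r_hi = math.sqrt(2.0 * mass * (energy + d_energy))
    u = rng.random()
    r = r_lo * (1.0 + u * ((r_hi / r_lo) ** dim - 1.0)) ** (1.0 / dim)
    p = [r * x / norm for x in g]
    return q, p


def hamiltonian(p, mass):
    """Kinetic energy of the free-particle system."""
    return sum(x * x for x in p) / (2.0 * mass)
```

Every point so drawn has an energy between E and E+δE, and equal volumes of the shell are equally likely to be selected, which is the content of (26.1) for this simple system. For interacting molecules the shell is no longer a spherical layer in momentum space and a different sampling scheme would be needed.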

The particular characteristic of such a microcanonical ensemble, 
which we now wish to emphasize, is the fact that it has a distribution 
which is known (§22) to remain in statistical equilibrium. Thus the 
most probable or other kind of average values which it predicts for
the properties of the system of interest will themselves not be changing
with the time. Hence a microcanonical ensemble may be regarded as
representing a system which is itself, as far as we know, in a steady
macroscopic condition when looked at from a point of view that 
neglects the behaviour of individual molecules. 

This important character of the microcanonical ensemble arises from 
the fact that it represents a system concerning whose state nothing 
but the energy is known, and a mere specification of energy does not
lead to the expectation of any particular direction of change for the
macroscopic properties of the system. Fluctuations of those properties
around their most probable values may be expected, and would indeed
be predicted from the microcanonical ensemble but, nevertheless, as 
taking place with equal frequency in any given direction and its reverse. 
The situation is usually quite different when we have a system for 
which other things besides the energy are known. For example, if we 
consider a gas known to have a given total energy, and also known at 
some given instant to be completely situated in one-half of the total 
container available to it, the corresponding representative ensemble 
would be such as to give a practically certain prediction that the gas 
would start at once in the direction of distributing itself more uniformly 
throughout the container. 

Instead of using the microcanonical ensemble, as we shall in the 
present chapter, for studying the properties of a system of molecules 
in a condition of equilibrium, it would also be possible to use the 
canonical ensemble. Such an ensemble also represents a system which 
is in a steady condition of macroscopic equilibrium, but one for which 
the energy instead of being precisely specified has a most probable 
value. In the classical statistics, as developed by Gibbs, such canonical
ensembles were already found especially appropriate for studying
systems having a precisely specified temperature rather than a precisely
specified energy. In the quantum statistics, as developed later in this
book, we shall also find reasons for such use of the canonical ensemble.
For our present purposes, however, we shall find the microcanonical
ensemble adequate. 

In concluding this section on the classical use of the microcanonical
ensemble to represent a system of known energy in a macroscopically
steady state, it is to be noted that the ensemble is such as to give equal
probabilities for finding the representative point for the actual system
of interest in all equal elements $\delta v_\gamma$ of the corresponding phase space
or $\gamma$-space which are located within the volume available for the
molecules and within the infinitesimal energy range $E$ to $E+\delta E$. This
is the result which we shall need in following sections in calculating the
probabilities for the different conditions of a system of molecules that 
will interest us. 

27. Specification of condition for a system of many similar 

molecules 

We may now turn to the problem of specifying different conditions
of interest for a system composed of many molecules. We shall wish
to specify these conditions in terms of the distribution of the molecules
composing the system among their own possible states, and may confine
our attention at the start to systems containing molecules of a single
kind, since the later extension to molecules of more than one kind (see
§ 30) will involve no additional principles. We may take $n$ as the total
number of similar molecules composing the system, and $r$ as the number
of degrees of freedom for a single molecule of the kind in question.
The total number of degrees of freedom for the system as a whole will
then be $f = nr$.

We may begin by considering the specification of state for a single
molecule in the system. For this purpose we may use any desired set
of $r$ conjugate coordinates and momenta

$$ q_1, \ldots, q_r, \; p_1, \ldots, p_r \tag{27.1} $$

whose values would describe the state of the molecule, considered itself
as a minute mechanical system. It will often be convenient to regard
the first three of the above coordinates, $q_1$, $q_2$, $q_3$, as giving the position
of the centre of gravity of the molecule with respect to a set of Cartesian
axes, which is taken stationary relative to the container for the system,
and regard the remaining coordinates as giving the internal configuration
of the molecule. Our set of coordinates and momenta for the
individual molecule then becomes

$$ x, y, z, q_4, \ldots, q_r, \; p_x, p_y, p_z, p_4, \ldots, p_r \tag{27.2} $$

where $x$, $y$, $z$ are the Cartesian coordinates for the centre of gravity of
the molecule, $p_x$, $p_y$, $p_z$ are the corresponding components of linear
momentum for the molecule as a unit, and the remaining coordinates
and momenta refer to the internal configuration and motion of the
molecule.

Having chosen the kind of coordinates and momenta to be used for
the individual molecules, it will then be convenient to introduce the
idea of a $\mu$-space, provided with $2r$ rectilinear axes, one for each of
the $r$ coordinates and $r$ momenta. The exact position of a representative
point in such a $\mu$-space would then give an exact specification of the
state of the molecule under consideration.

For our later statistical applications, however, we shall not be
interested in the precise states of molecules but rather in the various
small ranges within which the values of their coordinates and momenta
might fall. For this purpose we may now consider the $\mu$-space for any
molecule of interest as divided up into a collection of equal elementary
regions, corresponding to different ranges

$$ \delta v_\mu = \delta q_1 \cdots \delta q_r \, \delta p_1 \cdots \delta p_r \tag{27.3} $$

all having the same magnitude of extension. Such elementary regions
are often spoken of as cells in the $\mu$-space. Thinking of these different
cells as labelled by the integers 1, 2, 3, ..., i, ..., we can then specify the
state of any molecule, within the range desired, by giving the particular
cell in the $\mu$-space inside of which the representative point for the
molecule is taken as situated.

We may next turn to the specification of the state of the system as
a whole. For this purpose we shall regard the $2r$ coordinates and
momenta, as assigned above to each of the individual molecules, as
now furnishing a total collection of $2nr$ coordinates and momenta
suitable for the system itself.† The state of the whole system can then
be regarded as determined by the individual states of its $n$ component
molecules. Furthermore, by taking a $\mu$-space of the kind discussed
above for each of the different molecules and combining these individual
spaces together, we can now construct a $\gamma$-space or phase space suitable
for the whole system. The exact position of a single representative
point in this $\gamma$-space will then give the exact location of each particular
molecule in its own $\mu$-space, and hence give an exact specification of
the state of the whole system.

† This implies that we can regard the coordinates and momenta of the molecules
themselves as giving a sufficiently precise specification of the intermolecular field, and
can neglect the circumstance that, in general, a knowledge of additional variables would
really be necessary to give an exact specification to the complicated electromagnetic
field existing between the molecules. This approximation is valid when the velocities
of the electric charges involved are small compared with that of light.

However, as already remarked above, we shall be interested for our
later statistical applications in specifications of state which are not
precise, but which correspond to ranges within which the coordinates
and momenta for individual molecules might lie. For this purpose we
may now regard the equal elementary regions, into which we have
divided the $\mu$-space for a single molecule, as also providing a division
of the whole $\gamma$-space into equal elementary regions, all having the same
magnitude of extension, but corresponding to different ranges of the
form

$$ \delta v_\gamma = (\delta q_1 \cdots \delta p_r)_1 \, (\delta q_1 \cdots \delta p_r)_2 \cdots (\delta q_1 \cdots \delta p_r)_k \cdots (\delta q_1 \cdots \delta p_r)_n \tag{27.4} $$

where the subscripts 1, 2, ..., k, ..., n denote the particular molecule to
which the indicated range applies. Within the desired ranges of values
for the coordinates and momenta of the component molecules we could
then specify the state of the system as a whole by giving the particular
region in the $\gamma$-space, of the form (27.4), inside of which the
representative point for the whole system is taken as situated.

Having obtained this method of using elementary regions of the
form (27.4) for specifying the (approximate) state of our system, it is
now to be specially noted that a number of such regions would, in
general, correspond to the same condition of the system from the point
of view of its macroscopic properties and behaviour, since from that
viewpoint it makes no difference which particular molecules of the $n$
similar ones are taken as lying in specified cells of the $\mu$-space. For
example, if our system is specified by an elementary region having
such a character that each of the $n$ molecules is assigned to a different
cell in the $\mu$-space, it is then evident by considering the possibilities
for permuting the molecules among the different occupied cells that
there would be a total of $n!$ different regions in the $\gamma$-space all
corresponding to the same gross observational properties.

For this reason we shall now finally be specially interested in
specifying what may be called the condition of our system by stating merely
the numbers of molecules

$$ n_1, n_2, n_3, \ldots, n_i, \ldots \tag{27.5} $$

which are assigned to different cells $i$ in the $\mu$-space, without specifying
just which molecules are used to provide the quotas for those different
cells. Taking any such condition of the system, it is then evident that
there would be a total of

$$ \frac{n!}{n_1!\, n_2!\, n_3! \cdots n_i! \cdots} \tag{27.6} $$

different elementary regions $\delta v_\gamma$ of the form given by (27.4), which
would all correspond to this same specified condition. By changing
from one such elementary region in the $\gamma$-space to another we change
the state of the system, as specified within a range $\delta v_\gamma$ of the form
(27.4), but leave what we have called the condition of the system
unaltered.† Such changes from one $\delta v_\gamma$ to another could be brought
about by merely interchanging pairs of entirely similar molecules which
were originally assigned to different cells of the $\mu$-space, and hence
these changes would have no effect on the gross observational
properties of the system.
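
As a check on the count (27.6), the following Python sketch compares the formula with a direct enumeration for a very small system. The function names, and the representation of a condition as a list of occupation numbers, are illustrative assumptions and not part of the original treatment; the enumeration is practicable only for a handful of molecules and cells.

```python
from collections import Counter
from itertools import product
from math import factorial, prod


def regions_by_formula(occupations):
    """Number of gamma-space regions for one condition, following (27.6)."""
    n = sum(occupations)
    return factorial(n) // prod(factorial(k) for k in occupations)


def regions_by_enumeration(occupations):
    """Count the assignments of labelled molecules to cells that reproduce
    the given occupation numbers (brute force, small systems only)."""
    n = sum(occupations)
    cells = range(len(occupations))
    target = tuple(occupations)
    count = 0
    # Each tuple gives the cell index occupied by molecules 1, 2, ..., n.
    for assignment in product(cells, repeat=n):
        tally = Counter(assignment)
        if tuple(tally.get(c, 0) for c in cells) == target:
            count += 1
    return count
```

When every molecule lies in a different cell, all the occupation numbers are unity and the formula reduces to n!, in agreement with the special case noted earlier in this section.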

In this connexion it may be noted in passing that our procedure in
regarding the interchange of two similar molecules as corresponding to
a significant change in the mechanical state of a system, even though
not in its condition, evidently implies the possibility of keeping a
continuous observation on the system which would let us know whether
two similar molecules do change roles or not. This, however, is in entire
agreement with the point of view of the classical mechanics, which would
permit such a continuous observation, at least in principle, without any
disturbing effect on the behaviour of the molecules. On the other hand,
when we come to the quantum mechanics we must be prepared to find
limitations on the idea of maintaining a precise observational control 
on the behaviour of molecules without introducing disturbances. Indeed 
we shall actually find in the quantum mechanics that we should lose 
all knowledge as to the distinction between two similar molecules, if 
they pass through an encounter sufficiently intimate so that it could 
not be described in terms of classical trajectories on account of the 
complementarity expressed by the Heisenberg uncertainty principle. 
This feature of the quantum mechanics proves essentially to be involved 
in the origin of certain striking differences between the classical and 
quantum statistics of systems containing colliding molecules, which 
can become very important under some circumstances, as we shall 
see later. 


† A remark may be desirable as to the terminology adopted in the present book.
The word state, when without qualification, is used as corresponding to the most
precise specification of instantaneous condition theoretically possible. In the classical
mechanics this means precise values for all the $2f$ coordinates and momenta for the
system; in the quantum mechanics it means precise values for $f$ instead of $2f$ variables,
as we shall see later. The word state, when qualified (for example, state specified
within a range), can be used to correspond to a less precise specification of condition.
The word condition will be used in general corresponding to specifications of different
degrees of precision.

This is not the only terminology that might have been adopted. Some authors use
the phrases ‘microscopic state’ and ‘macroscopic state’ to denote respectively what
we have described above by ‘the state of the system as specified within a range $\delta v_\gamma$’
and ‘the condition as specified by the numbers of molecules $n_i$ in the various cells $i$ of
the $\mu$-space’. Such a terminology is not fortunate, however, since the word
‘microscopic’ would suggest the greatest possible precision, and the word ‘macroscopic’ would
suggest primarily a lesser degree of precision rather than the neglect of distinctions
between different similar molecules.





28. The probabilities for different conditions of the system 

We may now combine the results of the two preceding long sections.
In the first of these sections, § 26, it has been shown that a system in
macroscopic equilibrium with a specified total energy can be
appropriately represented by a microcanonical ensemble, and that this
ensemble gives equal probabilities for all equal regions in the $\gamma$-space
which correspond to the range of energies $E$ to $E+\delta E$ considered. In the
second of the foregoing sections, § 27, it has been shown for a system of $n$
similar molecules that there would be a total of $n!/(n_1!\, n_2!\, n_3! \cdots n_i! \cdots)$
different equal regions $\delta v_\gamma$ in the $\gamma$-space which would all correspond
to the condition of the system specified by taking $n_1, n_2, n_3, \ldots, n_i, \ldots$ as
the numbers of molecules assigned to different equal cells in the $\mu$-space
for the kind of molecule involved. Combining these two findings, we
can now write

$$ P = \frac{n!}{n_1!\, n_2!\, n_3! \cdots n_i! \cdots} \times \text{const.} \tag{28.1} $$

for the probability $P$ of finding the system of interest in a condition
specified by the numbers of molecules $n_i$ in the different cells $i$ of the
$\mu$-space, where the factor denoted by ‘const.’ would have the same
value irrespective of the condition considered.

For our further purposes it will be more convenient to take the
logarithm of this probability, which can be expressed with the help of
a summation over all cells $i$, in the form

$$ \log P = \log n! - \sum_i \log n_i! + \text{const.} \tag{28.2} $$

In problems of ordinary interest, moreover, not only will the total
number of molecules $n$ in the system be exceedingly large, but the
numbers of molecules $n_i$ in most of the occupied cells $i$ can also be
taken as large compared with unity. It then becomes possible to
simplify (28.2) by introducing Stirling’s approximation for the factorials
of large numbers:

$$
n! \approx \sqrt{2\pi n}\,\left(\frac{n}{e}\right)^{n}, \qquad
\log n! \approx n \log n - n + \tfrac{1}{2} \log 2\pi n. \tag{28.3}
$$

Doing so, noting the cancellation of the two terms $-n$ and $+\sum_i n_i$,
and dropping terms of order $\log n$, the result (28.2) then leads to

$$ \log P = n \log n - \sum_i n_i \log n_i + \text{const.} \tag{28.4} $$

as the finally desired expression for the probability $P$ of a condition
of the system specified by the numbers of molecules $n_i$ in different
equal cells $i$ of the $\mu$-space for the kind of molecule involved.
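
The quality of the approximations leading from (28.2) to (28.4) can be examined numerically. The Python sketch below is offered only as an illustration: it evaluates the logarithms of the factorials by means of the log-gamma function of the standard library and compares them with Stirling's formula (28.3), with and without the term of order log n. The function names and the sample occupation numbers used in any trial are assumptions, not quantities taken from the text.

```python
from math import lgamma, log, pi


def log_factorial(k):
    """log k! evaluated through the log-gamma function."""
    return lgamma(k + 1.0)


def stirling(k, keep_half_log=True):
    """Stirling's approximation (28.3) for log k!, optionally without
    the term (1/2) log(2 pi k) that is dropped in passing to (28.4)."""
    value = k * log(k) - k
    if keep_half_log:
        value += 0.5 * log(2.0 * pi * k)
    return value


def log_weight_exact(occupations):
    """log n! - sum_i log n_i!, the condition-dependent part of (28.2)."""
    n = sum(occupations)
    return log_factorial(n) - sum(log_factorial(k) for k in occupations)


def log_weight_approx(occupations):
    """n log n - sum_i n_i log n_i, the condition-dependent part of (28.4)."""
    n = sum(occupations)
    return n * log(n) - sum(k * log(k) for k in occupations if k > 0)
```

For occupation numbers of a few thousand, such as 5000, 3000, and 2000, the two expressions for the logarithm of the weight differ by roughly a tenth of one per cent, and the relative discrepancy diminishes as the numbers increase.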




29. Condition of maximum probability. Maxwell-Boltzmann

distribution law 

With the help of the foregoing expression for the probabilities of
different conditions of a system, composed of $n$ similar molecules in
macroscopic equilibrium with a specified energy $E$, we shall now be
interested in determining which condition of the kind specified will
be the most probable one. To carry out the analysis, let us take the
numbers of molecules in most of the occupied cells $i$ as being large
enough, not only to permit the foregoing use of Stirling’s
approximation for their factorials, but also to justify us in treating the numbers
themselves as continuous variables in applying the calculus of
variations. We can then use (28.4) as an appropriate expression for the
probability $P$ of any specified condition, and can take the variational
equation

$$ \sum_i (\log n_i + 1)\, \delta n_i = 0 \tag{29.1} $$

as a necessary condition for a maximum value of $P$.
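
The passage from (28.4) to (29.1) may be written out explicitly. The following lines are a sketch which assumes, in anticipation of the subsidiary condition (29.2), that the total number n is held fixed, so that neither the term n log n nor the constant contributes to the variation.

```latex
\begin{aligned}
\delta \log P
  &= \delta\Bigl( n \log n - \sum_i n_i \log n_i + \text{const.} \Bigr) \\
  &= -\sum_i \Bigl( \log n_i + n_i \cdot \frac{1}{n_i} \Bigr)\, \delta n_i \\
  &= -\sum_i ( \log n_i + 1 )\, \delta n_i .
\end{aligned}
```

Setting this variation equal to zero, as required at a maximum of log P, gives the condition (29.1).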

The above variations $\delta n_i$ in the numbers of molecules in the various
states $i$ cannot be carried out, however, in a completely arbitrary
manner. In the first place, the total number of molecules $n$ cannot be
altered, so that we must also have the subsidiary equation

$$ \delta n = \sum_i \delta n_i = 0. \tag{29.2} $$

In the second place, the total energy of the system $E$ has to remain
within the infinitesimal range which we have specified for our
microcanonical ensemble, so that we shall also have the subsidiary equation

$$ \delta E = \sum_i \epsilon_i\, \delta n_i = 0, \tag{29.3} $$

where $\epsilon_i$ is the rate of increase in the energy of the system per molecule
introduced into the $i$th region of the $\mu$-space for equilibrium values of
the $n_i$, as will be further discussed below. (The variation $\delta E$ in (29.3)
should, of course, not be confused with the infinitesimal range in energy
for the ensemble.)

Equations (29.1), (29.2), and (29.3) are thus three simultaneous
expressions which must be satisfied in order to secure the desired conditional
maximum of $\log P$ or of $P$ itself. Combining them by Lagrange’s
method of undetermined multipliers, we obtain a single equation which
can be written in the form

$$ \sum_i (\log n_i + \alpha + \beta \epsilon_i)\, \delta n_i = 0, \tag{29.4} $$

where $\alpha$ and $\beta$ are undetermined constants. Since the variations $\delta n_i$
can now be treated as arbitrary, this then leads for each region in the
$\mu$-space to the equation

$$ \log n_i + \alpha + \beta \epsilon_i = 0. \tag{29.5} $$

Solving for $n_i$, we then finally obtain

$$ n_i = e^{-\alpha - \beta \epsilon_i} \tag{29.6} $$

as the desired expression for the most probable number of molecules
in the $i$th one of the equal ranges $\delta q_1 \cdots \delta p_r$ which we have set up for
the coordinates and momenta of the kind of molecules that we are
considering. This result, which gives the most probable distribution of
molecules among their own individual states for a system in a
macroscopically steady condition, may be called the Maxwell-Boltzmann
distribution law.
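
The content of (29.6), together with the subsidiary conditions (29.2) and (29.3), can be illustrated numerically. The Python sketch below assumes a finite list of cell energies that do not depend on the occupation numbers (the dilute case), and determines α and β from the prescribed total number and total energy by bisection. The function name, the bracketing interval for β, and the method of solution are illustrative assumptions only; the text itself leaves the determination of the constants to a later section.

```python
from math import exp, log


def maxwell_boltzmann(energies, n, total_energy, lo=-50.0, hi=50.0):
    """Return (alpha, beta, occupations) with n_i = exp(-alpha - beta * e_i),
    chosen so that sum n_i = n and sum e_i n_i = total_energy.

    Assumes the target mean energy lies strictly between the lowest and
    highest cell energies and that beta falls inside the bracket [lo, hi].
    """
    target = total_energy / n

    def mean_energy(beta):
        weights = [exp(-beta * e) for e in energies]
        return sum(e * w for e, w in zip(energies, weights)) / sum(weights)

    # The mean energy per molecule decreases steadily as beta increases,
    # so simple bisection on beta is sufficient.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean_energy(mid) > target:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)

    # Fix alpha from the total number: exp(-alpha) * sum exp(-beta e_i) = n.
    z = sum(exp(-beta * e) for e in energies)
    alpha = log(z / n)
    occupations = [exp(-alpha - beta * e) for e in energies]
    return alpha, beta, occupations
```

With four cells of energies 0, 1, 2, 3 in arbitrary units, a thousand molecules, and a total energy of a thousand units, the routine returns a positive β and occupation numbers falling off exponentially with the energy, each of which satisfies (29.5).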

The quantities $\alpha$ and $\beta$ occurring in (29.6) are constants whose values
are related to the total number of molecules and to the energy (or
temperature) of the system in a manner which we shall examine more
closely in a later section. The quantity $\epsilon_i$ will be seen from the manner
of its introduction in (29.3) to be the rate of increase in the energy of
the system,

$$ \epsilon_i = \frac{\partial E}{\partial n_i}, \tag{29.7} $$

per molecule introduced into the $i$th region of the $\mu$-space, when we
have the most probable distribution of molecules over the different
regions. In the case of a sufficiently dilute gas, where the energy of
interaction between the molecules can be taken as negligible, $\epsilon_i$ will
simply be the energy that would be assigned to a molecule having the
position, internal configuration, external and internal momenta
corresponding to the $i$th range of coordinates and momenta $\delta q_1 \cdots \delta p_r$. In the
case of more concentrated systems, where the energy of a molecule
would also depend appreciably on interaction with its neighbours, the
interpretation of $\epsilon_i$ would be less simple. We shall be primarily
interested in the case of dilute gases and shall usually treat $\epsilon_i$ simply as
equal to the actual energy of a single molecule in the $i$th cell $\delta q_1 \cdots \delta p_r$
of the $\mu$-space.†

† In expositions usually given, it is tacitly assumed from the start that the
system is a highly dilute gas, and that $\epsilon_i$ can be treated in this manner. The more
general treatment of $\epsilon_i$, suggested above and previously by the writer, might prove
useful in studying more concentrated systems.

The above derivation of the Maxwell-Boltzmann distribution law
was obtained with the help of the Stirling approximation for the
factorials $n_i!$ and with the added approximation involved in treating
the numbers $n_i$ themselves as continuous variables. This was done in
order to employ the usual methods of the calculus of variations as
applied to continuous variables. The results obtained are practically
exact, however, for all regions $i$ where the numbers $n_i$ are large
compared with unity. Hence the introduction of the approximations has
no serious consequences in the treatment of systems of ordinary interest
composed of many molecules.

Other methods of studying the equilibrium distribution of molecules 
are open to us, depending on different possibilities of choice as to 
representative ensemble and computed average. Two methods which 
have received considerable attention may be briefly mentioned here. 

In one of these methods the microcanonical distribution is retained as
the representative ensemble, but instead of the most probable numbers
the mean numbers of molecules in the various cells $i$ are calculated
for the systems composing the ensemble. This procedure has the
advantage that it no longer proves necessary to take the number of
molecules in a given cell as large in order to perform the computation;
nevertheless the actual calculation is found to involve approximations
of its own, and the method is less expeditious than the historically
familiar one presented above. The new averages (mean numbers
instead of most probable numbers) are also found† to obey the
Maxwell-Boltzmann distribution law (29.6).

In the other of the alternative methods to be mentioned, the canonical
distribution instead of the microcanonical is taken as furnishing the
representative ensemble, and the mean numbers of molecules in the
various cells $i$ are the averages calculated for the systems composing
the ensemble. As already noted in §26, the change from a microcanonical
to a canonical ensemble corresponds to a change from a
system of interest for which the energy is precisely specified to one in
which there is a spread in the values which might be found for that
quantity. We may regard this last method of attack as the most
satisfactory of all. Using this method we shall be able to show (see §113)
that calculations made without introducing any approximations can
lead precisely to a result of the form (29.6).

† See Fowler, Statistical Mechanics, second edition, Cambridge, 1936.

As a final remark in connexion with the derivation which we have
given for the Maxwell-Boltzmann distribution law, it may be emphasized
for the case of large numbers of molecules that the Maxwell-Boltzmann
distribution can be shown to be not only the most probable distribution
but also to be much more probable than other distributions which differ
appreciably therefrom. We shall leave the calculation of fluctuations,
however, until we have developed the full apparatus of the quantum
statistics.

30. Maxwell-Boltzmann distribution for molecules of more 
than a single kind 

We must next consider the form of the equilibrium distribution law
for a system composed of interacting but not chemically reacting
molecules of more than a single kind. For the total numbers of molecules
of the different kinds present, we may now use the symbols $n$, $n'$, $n''$,
etc., and may specify a given condition of the system by taking $n_i$,
$n'_j$, $n''_k$, etc., as being the numbers of molecules of these different kinds
which are assigned to the various equal cells $i$, $j$, $k$, etc., into which
we divide the $\mu$-spaces for these different kinds of molecules.

In accordance with methods of treatment similar to those previously
developed, it is now evident that we can take the probability $P$ for
any such condition of our system, when it has come to macroscopic
equilibrium with a definite energy $E$, as given by

$$\log P = n\log n - \sum_i n_i\log n_i + n'\log n' - \sum_j n'_j\log n'_j + n''\log n'' - \sum_k n''_k\log n''_k + \ldots + \text{const.} \tag{30.1}$$

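As the equations determining the most probable condition of the system
under the imposed requirements as to total numbers of molecules and
total energy, we shall then have

$$\delta\log P = -\sum_i (\log n_i + 1)\,\delta n_i - \sum_j (\log n'_j + 1)\,\delta n'_j - \sum_k (\log n''_k + 1)\,\delta n''_k - \ldots = 0, \tag{30.2}$$

$$\delta n = \sum_i \delta n_i = 0, \qquad \delta n' = \sum_j \delta n'_j = 0, \qquad \delta n'' = \sum_k \delta n''_k = 0, \tag{30.3}$$

and

$$\delta E = \sum_i \epsilon_i\,\delta n_i + \sum_j \epsilon'_j\,\delta n'_j + \sum_k \epsilon''_k\,\delta n''_k + \ldots = 0. \tag{30.4}$$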

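Combining once more by the method of undetermined multipliers, this
gives us as the single equation which must be satisfied

$$\sum_i (\log n_i + \alpha + \beta\epsilon_i)\,\delta n_i + \sum_j (\log n'_j + \alpha' + \beta\epsilon'_j)\,\delta n'_j + \sum_k (\log n''_k + \alpha'' + \beta\epsilon''_k)\,\delta n''_k + \ldots = 0, \tag{30.5}$$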




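where $\alpha$, $\alpha'$, $\alpha''$, ... and $\beta$ are undetermined constants. Since the
variations can now be treated as independent, each bracket in (30.5) must
vanish separately, and we then have

$$n_i = e^{-\alpha-\beta\epsilon_i}, \qquad n'_j = e^{-\alpha'-\beta\epsilon'_j}, \qquad n''_k = e^{-\alpha''-\beta\epsilon''_k}, \ \ldots \tag{30.6}$$

as the desired expressions for the most probable numbers of molecules
of the different kinds in the different cells of their $\mu$-spaces, when
macroscopic equilibrium prevails.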

It will be noted that each kind of molecule will be distributed in
accordance with an expression having the form of the original
Maxwell-Boltzmann distribution law (29.6). The constants $\alpha$, $\alpha'$, $\alpha''$, ... will, in
general, be different for the different kinds of molecules, and will have
values determined by the total number of molecules of the kind in
question, as we shall examine in more detail presently. The constant
$\beta$ will be the same for the different kinds of molecules, and will
be related to the temperature of the system, as we shall also see
presently. For sufficiently dilute systems the quantities $\epsilon_i$, $\epsilon'_j$, $\epsilon''_k$, ... will
be the energies to be ascribed to a single molecule in the indicated cells
$i$, $j$, $k$, ... into which we have divided the $\mu$-spaces for the different kinds
of molecule involved.
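
The roles of the constants may be made concrete with a small numerical sketch in Python. It is only an illustration under assumed values: the cell energies, the totals, and the value of $\beta$ below are hypothetical, and the function name is not from the text. Each kind receives its own $\alpha$ from the requirement on its total number, while the same $\beta$ serves for both kinds.

```python
import math

def most_probable_numbers(total, energies, beta):
    """Return (alpha, [n_i]) with n_i = exp(-alpha - beta*eps_i), as in (30.6),
    where alpha is fixed by the condition sum(n_i) == total."""
    weights = [math.exp(-beta * e) for e in energies]
    alpha = math.log(sum(weights) / total)
    return alpha, [math.exp(-alpha) * w for w in weights]

beta = 2.0                       # shared by both kinds (illustrative units)
eps_a = [0.0, 0.5, 1.0, 1.5]     # hypothetical cell energies, first kind
eps_b = [0.0, 1.0, 2.0]          # hypothetical cell energies, second kind

alpha_a, n_a = most_probable_numbers(1000.0, eps_a, beta)
alpha_b, n_b = most_probable_numbers(250.0, eps_b, beta)

print(alpha_a, n_a)
print(alpha_b, n_b)
```

The ratio of the numbers in two cells of the same kind depends only on $\beta$ and the energy difference, whatever the total number of that kind may be.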

31. Re-expression of Maxwell-Boltzmann law in differential form

The expression for the Maxwell-Boltzmann distribution law

$$n_i = e^{-\alpha-\beta\epsilon_i} \tag{31.1}$$

gives the equilibrium number of molecules $n_i$ in the $i$th equal region
$\delta q_1 \ldots \delta p_r$ into which we have divided the $\mu$-space for the kind of molecule
under consideration. It will now be natural to re-express this in a form
which recognizes the dependence of the number of molecules in any
region of the $\mu$-space on the extension of that region and on the total
number of molecules available.

To do this we may introduce a new constant $C$ defined in terms of
our previous constant $\alpha$ by the expression

$$e^{-\alpha} = nC\,\delta q_1 \ldots \delta p_r, \tag{31.2}$$

where $n$ is the total number of molecules and $\delta q_1 \ldots \delta p_r$ is the extension
in the $\mu$-space of the different equal cells into which we have divided
that space. Substituting (31.2) into (31.1), and using the symbol $\delta n$ to
denote the number of molecules in the elementary cell, we can then
rewrite the Maxwell-Boltzmann distribution law in the more familiar
form

$$\delta n = nCe^{-\beta\epsilon}\,\delta q_1 \ldots \delta p_r. \tag{31.3}$$


This may now be interpreted as a differential form of expression for
the Maxwell-Boltzmann distribution law. It makes the number of
molecules $\delta n$ in any region of the $\mu$-space proportional to the extension
$\delta q_1 \ldots \delta p_r$ of that region, and also indicates the dependence of $\delta n$ on the
energy $\epsilon$ of a molecule in that region, where this quantity is now to be
treated as a function

$$\epsilon = \epsilon(q_1 \ldots p_r) \tag{31.4}$$

of the coordinates and momenta of the molecule.


32. Evaluation of constants in the Maxwell-Boltzmann distribution law

(a) Value of constant $\alpha$ (or $C$). With the help of the foregoing
differential form of expression for the Maxwell-Boltzmann distribution
law we can now investigate the constants which it contains. The
consideration of the constant $\alpha$, or more conveniently the constant $C$ with
which we have replaced it, is very simple.

Starting with the Maxwell-Boltzmann law

$$\delta n = nCe^{-\beta\epsilon}\,\delta q_1 \ldots \delta p_r \tag{32.1}$$

and considering an integration over all possible values of the coordinates
and momenta for a molecule, we must evidently obtain the total number
of molecules $n$. This then gives us

$$n = nC\int\ldots\int e^{-\beta\epsilon}\,dq_1 \ldots dp_r. \tag{32.2}$$

And we hence obtain the result

$$C = \frac{1}{\int\ldots\int e^{-\beta\epsilon}\,dq_1 \ldots dp_r} \tag{32.3}$$

for the constant $C$ itself.

The integration in this expression is to be carried out over all possible
values of the coordinates and momenta $q_1 \ldots p_r$. In principle this can,
of course, always be done when we know the functional dependence of
the energy $\epsilon$ of a molecule on its coordinates and momenta. In the
case of systems composed of more than a single kind of molecule an
expression of form similar to (32.3) will hold for each of the constants
$C$, $C'$, $C''$, ... which can be correlated with the constants $\alpha$, $\alpha'$, $\alpha''$, ...
occurring in the original Maxwell-Boltzmann expressions (30.6) for each
kind of molecule.
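
As an illustration of how (32.3) might be evaluated in practice, the following Python sketch carries out the double integral numerically for a particle with one coordinate and one momentum. The harmonic form of the energy, the parameter values, and the function names are assumptions introduced only for this example; for that energy the integral is also known in closed form, which provides a check.

```python
import math

def normalisation_constant(energy, beta, q_max, p_max, steps=400):
    """C = 1 / (double integral of exp(-beta*energy(q, p)) dq dp),
    estimated with the midpoint rule on a steps x steps grid."""
    dq = 2.0 * q_max / steps
    dp = 2.0 * p_max / steps
    total = 0.0
    for a in range(steps):
        q = -q_max + (a + 0.5) * dq
        for b in range(steps):
            p = -p_max + (b + 0.5) * dp
            total += math.exp(-beta * energy(q, p))
    return 1.0 / (total * dq * dp)

# Assumed example: eps = p^2/2m + kappa*q^2/2 in arbitrary units.
m, kappa, beta = 1.0, 4.0, 1.0

def oscillator_energy(q, p):
    return p * p / (2.0 * m) + 0.5 * kappa * q * q

C_numeric = normalisation_constant(oscillator_energy, beta, q_max=4.0, p_max=8.0)
C_exact = beta * math.sqrt(kappa / m) / (2.0 * math.pi)   # closed form for this energy
print(C_numeric, C_exact)
```

The limits `q_max` and `p_max` are chosen so that the integrand is negligible beyond them; for other forms of the energy they would have to be chosen afresh.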





(b) Introduction of the idea of a perfect gas thermometer. In order
to proceed to an understanding of the other constant $\beta$, which occurs
in the Maxwell-Boltzmann expression, it will now be convenient to
regard our system of interest as including as one of its parts a vessel
of volume $v$, containing a dilute monatomic gas, composed of $n$ simple
particles of mass $m$. Such a gas, if sufficiently dilute, would obey the
known laws for a perfect gas, and the adjunct to our system that has
thus been introduced may itself be spoken of as a perfect gas
thermometer. We shall now show that it is possible to relate the constant $\beta$
to the temperature $T$ of this thermometer.

To obtain the desired relation we must first consider the
Maxwell-Boltzmann distribution for the $n$ particles of mass $m$ that compose the
contents of the thermometer. Introducing Cartesian coordinates $x$, $y$, $z$
for these particles, together with the corresponding momenta $m\dot{x}$, $m\dot{y}$,
$m\dot{z}$, the distribution law (32.1) may be written in the form

$$\delta n = nCm^3e^{-\beta\epsilon}\,\delta x\,\delta y\,\delta z\,\delta\dot{x}\,\delta\dot{y}\,\delta\dot{z}, \tag{32.4}$$

where $\delta n$ gives the number of particles at equilibrium which may be
expected to have coordinates in the range $\delta x\,\delta y\,\delta z$ and components of
velocity in the range $\delta\dot{x}\,\delta\dot{y}\,\delta\dot{z}$. Integrating the coordinates over the total
volume $v$ of the container, we may then write

$$\delta n = nCvm^3e^{-\beta\epsilon}\,\delta\dot{x}\,\delta\dot{y}\,\delta\dot{z},$$

or, denoting the product of the constant factors $Cvm^3$ by the single
letter $A$,

$$\delta n = nAe^{-\beta\epsilon}\,\delta\dot{x}\,\delta\dot{y}\,\delta\dot{z}, \tag{32.5}$$

for the number of particles having components of velocity in the
indicated range.

We may now use this result to obtain an expression for the pressure
$p$ of the gas. Consider an element $\delta S$ on the surface of the container,
taken for convenience as lying perpendicular to the $x$-axis in such a way
that it is bombarded by particles travelling towards it in the positive
$x$-direction; and consider now the number of those particles, having
a component of velocity between $\dot{x}$ and $\dot{x}+\delta\dot{x}$, which would collide with
this element of surface $\delta S$ in unit time. Making use of (32.5), and noting
that $\dot{x}\,\delta S$ would give the volume from which such particles could come
in unit time, we can evidently take

$$\frac{\dot{x}\,\delta S}{v}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} nAe^{-\beta\epsilon}\,\delta\dot{x}\,d\dot{y}\,d\dot{z}$$

as an expression for the number of such collisions, where the integrations
are taken over all possible values from minus to plus infinity for
the unspecified components of velocity $\dot{y}$ and $\dot{z}$. Furthermore, we can
also evidently take $2m\dot{x}$ as an expression for the momentum which would
be transferred to the surface by the impact of each such particle. Hence
for the total force acting on the element of surface we can now write

$$p\,\delta S = 2\,\frac{\delta S}{v}\int_{0}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} nAe^{-\beta\epsilon}\,m\dot{x}^2\,d\dot{x}\,d\dot{y}\,d\dot{z},$$

which, by changing the limits of integration for $\dot{x}$, gives us

$$p = \frac{1}{v}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} nAe^{-\beta\epsilon}\,m\dot{x}^2\,d\dot{x}\,d\dot{y}\,d\dot{z} \tag{32.6}$$

as a preliminary expression for the pressure $p$ of the gas.

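To evaluate the integral occurring in (32.6) let us now return to our
original expression (32.5) for the number of particles lying in the velocity
range $\delta\dot{x}\,\delta\dot{y}\,\delta\dot{z}$. By integrating this over all possible values of the
components of velocity $\dot{x}$, $\dot{y}$, $\dot{z}$ we shall obtain an expression for the
total number of particles in the container, which can evidently be
written in the successive forms

$$
\begin{aligned}
n &= \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} nAe^{-\beta\epsilon}\,d\dot{x}\,d\dot{y}\,d\dot{z} \\
  &= \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} nAe^{-\beta m(\dot{x}^2+\dot{y}^2+\dot{z}^2)/2}\,d\dot{x}\,d\dot{y}\,d\dot{z} \\
  &= \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\Big[nAe^{-\beta m(\dot{x}^2+\dot{y}^2+\dot{z}^2)/2}\,\dot{x}\Big]_{\dot{x}=-\infty}^{\dot{x}=+\infty} d\dot{y}\,d\dot{z}
     + \int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} nAe^{-\beta m(\dot{x}^2+\dot{y}^2+\dot{z}^2)/2}\,\beta m\dot{x}^2\,d\dot{x}\,d\dot{y}\,d\dot{z} \\
  &= \beta\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} nAe^{-\beta\epsilon}\,m\dot{x}^2\,d\dot{x}\,d\dot{y}\,d\dot{z},
\end{aligned}
\tag{32.7}
$$

where the second form of writing is obtained by substituting for the
energy of a particle of mass $m$ its known form of dependence on velocity,
the third form of writing comes from performing a partial integration
with respect to $\dot{x}$, and the last form comes from the consideration
that the term between the limits $\dot{x} = +\infty$ and $\dot{x} = -\infty$ is seen to go
to zero since $\beta$ must in any case be a positive quantity in order that
the total energy of the system be finite. The final form of expression
in (32.7) then gives the desired evaluation for the integral occurring in
(32.6), showing indeed that it has the simple value $n/\beta$.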
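
The value $n/\beta$ can be checked numerically. Since the integrations over the other two velocity components cancel between numerator and normalization, the check reduces to showing that the mean of $m\dot{x}^2$ under the weight $e^{-\beta m\dot{x}^2/2}$ equals $1/\beta$. The Python sketch below does this by simple quadrature; the function names and the sample values of $m$ and $\beta$ are assumptions for illustration.

```python
import math

def midpoint_integral(f, lo, hi, steps=20_000):
    """Midpoint-rule estimate of the integral of f from lo to hi."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def mean_m_xdot_squared(m, beta):
    """Mean of m*v^2 over the one-component weight exp(-beta*m*v^2/2)."""
    sigma = 1.0 / math.sqrt(beta * m)          # width of the weight
    lo, hi = -12.0 * sigma, 12.0 * sigma       # tails beyond this are negligible
    weight = lambda v: math.exp(-0.5 * beta * m * v * v)
    numerator = midpoint_integral(lambda v: m * v * v * weight(v), lo, hi)
    denominator = midpoint_integral(weight, lo, hi)
    return numerator / denominator

for m, beta in ((1.0, 1.0), (3.0, 0.5), (0.2, 7.0)):
    print(m, beta, mean_m_xdot_squared(m, beta) * beta)   # each product is close to 1
```

The result is independent of the mass, which is why the pressure that follows depends only on the number of particles, the volume, and $\beta$.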

By combining (32.7) with (32.6) we now obtain

$$p = \frac{n}{\beta v} \tag{32.8}$$

as the desired statistical mechanical expression for the pressure of the
gas in terms of the number of particles $n$, total volume $v$, and the
constant of interest $\beta$. On the other hand, since the gas by hypothesis is
sufficiently dilute to behave as perfect, we can also write

$$p = \frac{nkT}{v}, \tag{32.9}$$

where $k$ is a constant, as a phenomenological expression for the pressure
of the gas in terms of its temperature $T$. Equating the two expressions
for pressure given by (32.8) and (32.9), we now obtain the desired
relation

$$\beta = \frac{1}{kT} \tag{32.10}$$

between the constant $\beta$ appearing in the Maxwell-Boltzmann expression
for the particles in the perfect gas thermometer and the temperature $T$
of that gas.

(c) Value of constant $\beta$. The relation given by (32.10) applies in the
first instance to the $n$ particles which form the contents of the perfect
gas thermometer which we have regarded as in contact with or as a part
of our actual system of interest. Nevertheless, assuming appropriate
thermal interaction so that energy transfer can readily take place, it is
evident from the fact that we are considering equilibrium conditions
that the temperature $T$ would be the same in the rest of the system
as in the thermometer, and it is evident from the considerations of §30
that the same value of $\beta$ would appear in the Maxwell-Boltzmann
expressions for any class of molecules in the system as in the expression
for the particles in the thermometer. Hence we may now take the
relation

$$\beta = \frac{1}{kT} \tag{32.11}$$

as applying in general to any system obeying the Maxwell-Boltzmann
distribution law.


Equation (32.11) may be regarded as providing a connexion between
the purely statistical mechanical quantity $\beta$ and the empirically defined
temperature $T$. Since temperature is a quantity with which we already
have familiarity, the use of (32.11) to replace the quantity $\beta$ in statistical
mechanical equations by $1/kT$ is of real advantage in giving an
understanding of statistical mechanical results. At a later stage of our
considerations (see §§122 and 131) we shall present what may seem to be
a more fundamental method of introducing the thermodynamic notion
of temperature into statistical mechanical considerations.

The quantity $k$ occurring in the relation between $\beta$ and $T$ will be
seen from (32.9) to be the ordinary gas constant $R$ per mol of gas
divided by Avogadro's number, that is, by the number of molecules
in a mol. This quantity is ordinarily called Boltzmann's constant, and
has the known value†

$$k = 1.380 \times 10^{-16}\ \text{ergs per degree centigrade.} \tag{32.12}$$
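
The arithmetic behind this value is short enough to reproduce. The Python sketch below divides the gas constant per mol by Avogadro's number; the two input figures are rounded modern values assumed for the illustration, not the figures of the reference cited in the footnote, and the helper name is hypothetical.

```python
R_ERG = 8.314e7          # gas constant per mol, erg per degree (assumed modern value)
N_AVOGADRO = 6.022e23    # molecules per mol (assumed modern value)

# Boltzmann's constant as the gas constant per molecule.
k = R_ERG / N_AVOGADRO

def beta_from_temperature(T, k=k):
    """The constant beta of (32.11) for an absolute temperature T."""
    return 1.0 / (k * T)

print(k)                              # about 1.38e-16 erg per degree
print(beta_from_temperature(300.0))   # about 2.4e13 per erg
```

At ordinary temperatures $1/\beta = kT$ is thus of the order of $10^{-14}$ erg, which sets the scale of the molecular energies appearing in the exponent of the distribution law.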

33. Useful forms of expression for the distribution law 

In the present section we may now consider the expression of the
Maxwell-Boltzmann distribution law in several different forms which
prove useful when applications are to be made. Detailed consideration
of these applications will nevertheless have to be omitted in view of
the character of the present book.

Substituting the expression obtained in the last section for the
constant $\beta$ into our previous expression (31.3) for the Maxwell-Boltzmann
law, we may begin by writing

$$\delta n = nCe^{-\epsilon/kT}\,\delta q_1 \ldots \delta p_r \tag{33.1}$$

as a general expression for the number of molecules at equilibrium in
any infinitesimal range of coordinates and momenta $\delta q_1 \ldots \delta p_r$, where $n$
is the total number of molecules and $\epsilon$ is the energy contributed by
a single molecule in the above range. In agreement with (32.3), the
constant $C$ in the above expression will then have the value

$$C = \frac{1}{\int\ldots\int e^{-\epsilon/kT}\,dq_1 \ldots dp_r}, \tag{33.2}$$

where the integrations are carried out over all possible values of the
$r$ coordinates and $r$ momenta for a molecule, $q_1 \ldots q_r$, $p_1 \ldots p_r$. The actual
evaluation of $C$ will thus involve a knowledge of the energy per molecule
$\epsilon$ as a function of the coordinates and momenta, and the result
obtained will depend on the temperature $T$ of the system.

† Birge, Phys. Rev. Supplement, 1, 1 (1929); modified to correspond to the new value
$4.803 \times 10^{-10}$ e.s.u. for $e$.

For many purposes it is possible and convenient to take the first
three coordinates $q_1$, $q_2$, $q_3$ appearing above as being the Cartesian
coordinates $x$, $y$, $z$ for the centre of gravity of a molecule with respect
to axes fixed relative to the container for the system, and then take
the momenta corresponding to these coordinates as equal to $m\dot{x}$, $m\dot{y}$,
$m\dot{z}$, where $m$ is the mass of a molecule and $\dot{x}$, $\dot{y}$, $\dot{z}$ are its components
of velocity. We may also assume the possibility of expressing the
energy per molecule $\epsilon$ as a sum of the potential energy $\epsilon(x, y, z)$ associated
with its position in any (constant) external field of force present, of the
kinetic energy $\tfrac{1}{2}m(\dot{x}^2+\dot{y}^2+\dot{z}^2)$ due to its velocity as a whole, and of
the internal energy $\epsilon(q_4 \ldots q_r, p_4 \ldots p_r)$ associated with the remaining
coordinates and momenta

$$\epsilon = \epsilon(x, y, z) + \tfrac{1}{2}m(\dot{x}^2+\dot{y}^2+\dot{z}^2) + \epsilon(q_4 \ldots q_r, p_4 \ldots p_r). \tag{33.3}$$

Substituting into (33.1), we may then express the Maxwell-Boltzmann
distribution law in the more explicit form

$$\delta n = nCm^3\exp\!\left[-\frac{\epsilon(x,y,z)+\tfrac{1}{2}m(\dot{x}^2+\dot{y}^2+\dot{z}^2)+\epsilon(q_4 \ldots q_r, p_4 \ldots p_r)}{kT}\right] \times \delta x\,\delta y\,\delta z\,\delta q_4 \ldots \delta q_r\,\delta\dot{x}\,\delta\dot{y}\,\delta\dot{z}\,\delta p_4 \ldots \delta p_r. \tag{33.4}$$


With the help of this expression we may then investigate the distribution
of our molecules as a function of position, velocity, and internal
condition. When $\epsilon$ is separable as assumed, it will be noted that these
three aspects of the distribution can be regarded as independent.
Specifying other classes of variables in any desired way, the distribution
of molecules will be seen to decrease as we go to positions
of higher potential energy, or to velocities corresponding to higher
kinetic energy, or to internal variables corresponding to higher internal
energy.

The distribution over the different possible ranges of velocity is often
of special interest. To investigate this by itself we may regard the
expression given by (33.4) as integrated over all possible values of
the positional coordinates $x$, $y$, $z$ and of the internal coordinates and
momenta $q_4 \ldots q_r$, $p_4 \ldots p_r$. We shall then evidently obtain an expression
of the form

$$\delta n = nAe^{-m(\dot{x}^2+\dot{y}^2+\dot{z}^2)/2kT}\,\delta\dot{x}\,\delta\dot{y}\,\delta\dot{z}, \tag{33.5}$$

where $A$ is now a new constant containing the results of the integrations
which have been performed. This result, which gives the number of
molecules having components of velocity in the indicated range $\delta\dot{x}\,\delta\dot{y}\,\delta\dot{z}$,
may be called the Maxwell distribution law for velocities, in accordance
with Maxwell's original discovery of this specialized form of the more
general Maxwell-Boltzmann relation.



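By integrating (33.5) over all possible velocities we readily obtain for
the constant $A$ the value

$$A = \frac{1}{\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty}\int_{-\infty}^{+\infty} e^{-m(\dot{x}^2+\dot{y}^2+\dot{z}^2)/2kT}\,d\dot{x}\,d\dot{y}\,d\dot{z}} = \left(\frac{m}{2\pi kT}\right)^{3/2} \tag{33.6}$$

(see formulae of integration in Appendix II). This then makes it possible
to write the Maxwell distribution law in the following different forms
which prove useful under different circumstances.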

For the number of molecules, having components of velocity $\dot{x}$, $\dot{y}$, $\dot{z}$ in
a range $\delta\dot{x}\,\delta\dot{y}\,\delta\dot{z}$, we have

$$\delta n = n\left(\frac{m}{2\pi kT}\right)^{3/2} e^{-m(\dot{x}^2+\dot{y}^2+\dot{z}^2)/2kT}\,\delta\dot{x}\,\delta\dot{y}\,\delta\dot{z}. \tag{33.7}$$

For the number, having a speed $c$ and a direction of motion given by
the polar angles $\theta$ and $\phi$ lying in the range $\delta\theta\,\delta\phi\,\delta c$, we have

$$\delta n = n\left(\frac{m}{2\pi kT}\right)^{3/2} e^{-mc^2/2kT}\,c^2\sin\theta\,\delta\theta\,\delta\phi\,\delta c. \tag{33.8}$$

For the number, having a speed $c$ in the range $\delta c$ without reference to
direction, we have

$$\delta n = 4\pi n\left(\frac{m}{2\pi kT}\right)^{3/2} e^{-mc^2/2kT}\,c^2\,\delta c. \tag{33.9}$$

And for the number, having a kinetic energy $\epsilon$ lying in the range $\delta\epsilon$,
we have

$$\delta n = 2\pi n\left(\frac{1}{\pi kT}\right)^{3/2} e^{-\epsilon/kT}\,\epsilon^{1/2}\,\delta\epsilon. \tag{33.10}$$

The foregoing specific results often prove very useful. They are
particularly important in investigating the frequency and character of
the collisions that can be expected to occur between the molecules
of a gas. The results obtained find application in studying such physical
questions as the viscosity, thermal conductivity, and diffusion of gases,
and such chemical questions as the activation by collision of the
molecules of a chemically reacting gas.
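
The consistency of the speed form (33.9) and the energy form (33.10) can be verified numerically. The Python sketch below, offered only as an illustration, integrates each form per molecule (that is, with the factor $n$ removed) and confirms that both integrate to unity and that they agree under the substitution $\epsilon = \tfrac{1}{2}mc^2$. The molecular mass, the temperature, and the function names are assumptions.

```python
import math

K_ERG = 1.380e-16          # Boltzmann's constant in erg per degree, as in (32.12)

def speed_density(c, m, T, k=K_ERG):
    """Fraction of molecules per unit speed at speed c, from (33.9)."""
    return (4.0 * math.pi * (m / (2.0 * math.pi * k * T)) ** 1.5
            * math.exp(-m * c * c / (2.0 * k * T)) * c * c)

def energy_density(eps, T, k=K_ERG):
    """Fraction of molecules per unit kinetic energy at energy eps, from (33.10)."""
    return (2.0 * math.pi * (math.pi * k * T) ** -1.5
            * math.exp(-eps / (k * T)) * math.sqrt(eps))

def midpoint_integral(f, lo, hi, steps=20_000):
    """Midpoint-rule estimate of the integral of f from lo to hi."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

m = 28.0 / 6.022e23        # assumed mass in grams, roughly that of a nitrogen molecule
T = 300.0                  # assumed absolute temperature
c_scale = math.sqrt(2.0 * K_ERG * T / m)

speed_total = midpoint_integral(lambda c: speed_density(c, m, T), 0.0, 10.0 * c_scale)
energy_total = midpoint_integral(lambda e: energy_density(e, T), 0.0, 60.0 * K_ERG * T)
print(speed_total, energy_total)   # both close to 1
```

It may be noticed that the mass of the molecule does not appear in the energy form at all: at a given temperature every dilute gas has the same distribution of translational kinetic energy.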


34. Mean values obtained from the distribution law 

Since the Maxwell-Boltzmann distribution law gives information as to
the numbers of molecules under equilibrium conditions which would
have coordinates and momenta falling in any range of interest, we can
use the distribution law to calculate the mean values of any functions
of those variables. Thus if $f(q_1 \ldots q_r, p_1 \ldots p_r)$ is a function of the
coordinates and momenta of a single molecule, it is evident from the
general form of the Maxwell-Boltzmann distribution law as given by
(33.1) that the mean value of this quantity for the $n$ molecules of the
system would be given by

$$\bar{f} = \int\ldots\int C f(q_1 \ldots p_r)\,e^{-\epsilon/kT}\,dq_1 \ldots dp_r = \frac{\int\ldots\int f(q_1 \ldots p_r)\,e^{-\epsilon/kT}\,dq_1 \ldots dp_r}{\int\ldots\int e^{-\epsilon/kT}\,dq_1 \ldots dp_r}, \tag{34.1}$$

where the integrations are over all values of the coordinates and
momenta $q_1 \ldots p_r$, and where in this section and the next we shall use
a single bar to indicate a mean value taken over a distribution of
molecules. A few illustrations of the application of this equation may
be of interest.

We have already seen (33.2) that the parameter $C$ occurring in the
Maxwell-Boltzmann distribution law has a value which could be
computed, at least in principle, from the equation

$$C = \frac{1}{\int\ldots\int e^{-\epsilon/kT}\,dq_1 \ldots dp_r}. \tag{34.2}$$

The actual evaluation of the indicated integral over all values of the
coordinates and momenta may nevertheless be difficult to carry out.
Hence it is of interest to note that we can in any case obtain a simple
expression for the temperature coefficient of the parameter $C$. We
achieve this by taking a logarithmic differentiation of the quantity $C$
with respect to temperature $T$. This gives us

$$\frac{d\log C}{dT} = -\frac{\int\ldots\int e^{-\epsilon/kT}\,\dfrac{\epsilon}{kT^2}\,dq_1 \ldots dp_r}{\int\ldots\int e^{-\epsilon/kT}\,dq_1 \ldots dp_r},$$

since the only occurrence of $T$ is the explicit one in the exponent. With
the help of (34.1), we then see that our temperature coefficient has the
simple value

$$\frac{d\log C}{dT} = -\frac{\bar{\epsilon}}{kT^2}, \tag{34.3}$$

where $\bar{\epsilon}$ is the mean energy of the molecules under consideration. This
result at times proves useful.
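
The relation between the temperature coefficient of $C$ and the mean energy holds for any form of the energy, and a numerical check may make this plain. The Python sketch below uses a single momentum with a deliberately non-quadratic energy $\epsilon = bp^4$, takes $k = 1$ in arbitrary units, and compares a finite-difference estimate of $d\log C/dT$ with $-\bar{\epsilon}/kT^2$. The energy function, the parameter values, and the function names are assumptions made only for this illustration.

```python
import math

def midpoint_integral(f, lo, hi, steps=40_000):
    """Midpoint-rule estimate of the integral of f from lo to hi."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def log_C(T, energy, p_max, k=1.0):
    """log C = -log(integral of exp(-eps/kT) dp), following (34.2)."""
    z = midpoint_integral(lambda p: math.exp(-energy(p) / (k * T)), -p_max, p_max)
    return -math.log(z)

def mean_energy(T, energy, p_max, k=1.0):
    """Mean energy computed from the general formula (34.1)."""
    weight = lambda p: math.exp(-energy(p) / (k * T))
    num = midpoint_integral(lambda p: energy(p) * weight(p), -p_max, p_max)
    return num / midpoint_integral(weight, -p_max, p_max)

energy = lambda p: 0.7 * p ** 4        # assumed non-quadratic energy
T, dT, P_MAX = 2.0, 1e-4, 6.0

# Central finite difference for the temperature coefficient of log C.
lhs = (log_C(T + dT, energy, P_MAX) - log_C(T - dT, energy, P_MAX)) / (2.0 * dT)
rhs = -mean_energy(T, energy, P_MAX) / T ** 2   # -eps_bar / (k T^2) with k = 1
print(lhs, rhs)
```

For this quartic energy the mean energy is $\tfrac{1}{4}kT$, so both sides come out close to $-1/(4T)$.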

It is also of interest to apply the method of calculating mean values
given by (34.1) to a determination of the average speed $\bar{c}$ of the molecules
of a gas. For this purpose we can make use of Maxwell's equation for
the distribution of speeds (33.9), where the needed integrations have
already been performed over all the variables except the one of interest
$c$. For the mean speed of the molecules in a gas under equilibrium
conditions we then easily obtain

$$\bar{c} = \left(\frac{8kT}{\pi m}\right)^{1/2} = 14{,}600\,\sqrt{\frac{T}{M}}\ \frac{\text{cm.}}{\text{sec.}} \tag{34.4}$$

For the root-mean-square speed we can similarly obtain

$$\left(\overline{c^2}\right)^{1/2} = \left(\frac{3kT}{m}\right)^{1/2} = 15{,}800\,\sqrt{\frac{T}{M}}\ \frac{\text{cm.}}{\text{sec.}} \tag{34.5}$$

And by considering the conditions for a maximum, instead of taking
a mean, we can obtain for the most probable speed

$$c_{\text{max}} = \left(\frac{2kT}{m}\right)^{1/2} = 12{,}900\,\sqrt{\frac{T}{M}}\ \frac{\text{cm.}}{\text{sec.}} \tag{34.6}$$

The somewhat close agreement between these different kinds of average
for molecular speed results from a fairly sharp convergence around the
most probable value. The numerical values† are in centimetres per
second, with $T$ in degrees centigrade absolute, and $M$ the molecular
weight of the gas.
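
The numerical coefficients in these expressions follow from the gas constant alone, since $kT/m = RT/M$. The Python sketch below reproduces them; the value of $R$ is a rounded modern figure assumed for the illustration, and the function names and the sample gas are hypothetical.

```python
import math

R_ERG = 8.314e7    # gas constant per mol in erg per degree (assumed modern value)

def mean_speed(T, M):
    """Mean speed in cm/sec, (8RT / pi M)^(1/2)."""
    return math.sqrt(8.0 * R_ERG * T / (math.pi * M))

def rms_speed(T, M):
    """Root-mean-square speed in cm/sec, (3RT / M)^(1/2)."""
    return math.sqrt(3.0 * R_ERG * T / M)

def most_probable_speed(T, M):
    """Most probable speed in cm/sec, (2RT / M)^(1/2)."""
    return math.sqrt(2.0 * R_ERG * T / M)

# Coefficients multiplying sqrt(T/M): set T = M = 1.
print(mean_speed(1.0, 1.0), rms_speed(1.0, 1.0), most_probable_speed(1.0, 1.0))

# Example: a gas of molecular weight 28 at 273 degrees absolute.
print(mean_speed(273.0, 28.0))
```

The three speeds stand in the fixed ratios $\sqrt{2} : \sqrt{8/\pi} : \sqrt{3}$, whatever the gas or the temperature.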

The mean energy of the molecules in a system is an average of great
importance. For this we have in accordance with our general equation
(34.1) the formal expression

$$\bar{\epsilon} = \frac{\int\ldots\int e^{-\epsilon/kT}\,\epsilon\,dq_1 \ldots dp_r}{\int\ldots\int e^{-\epsilon/kT}\,dq_1 \ldots dp_r}, \tag{34.7}$$

which can be evaluated in principle for any known dependence of $\epsilon$ on
the $q$'s and $p$'s. Multiplying by the total number of molecules present
we can then also obtain an expression for the total energy of the system,
and by differentiating with respect to the temperature $T$ we can obtain
an expression for its heat capacity at constant volume.

† Dushman, High Vacuum, Schenectady, 1922.

The treatment of mean energies is specially simplified when we can
express the energy of a molecule $\epsilon$ as a sum of terms, one or more of
which depend only on a single variable out of the total set of coordinates
and momenta $q_1 \ldots q_r$, $p_1 \ldots p_r$. For example, it may be possible to
express the energy $\epsilon$ in the form

$$\epsilon = \epsilon(p_1) + \epsilon(q_1 \ldots q_r, p_2 \ldots p_r), \tag{34.8}$$

where the momentum $p_1$ occurs only in a single term. Using our general
formula for mean values (34.1), it is then evident that the mean energy
associated with this particular variable could be calculated from the
formula

$$\bar{\epsilon}(p_1) = \frac{\int e^{-\epsilon(p_1)/kT}\,\epsilon(p_1)\,dp_1}{\int e^{-\epsilon(p_1)/kT}\,dp_1}, \tag{34.9}$$

where the variables other than $p_1$ have been integrated out.

It frequently happens that the part of the energy under consideration
has a quadratic dependence on the variable involved,

$$\epsilon(p_1) = bp_1^2, \tag{34.10}$$

where $b$ is a constant, as, for example, in the case of the kinetic energy
$p_x^2/2m = m\dot{x}^2/2$ associated with the $x$-component of the velocity of a
molecule of mass $m$. Substituting (34.10) into (34.9) and performing
the indicated integrations, we then readily obtain,† with the limits of
integration from minus to plus infinity,

$$\bar{\epsilon}(p_1) = \tfrac{1}{2}kT. \tag{34.11}$$

In accordance with this finding the mean energy associated with each
variable, which contributes a quadratic term to the total energy of the
molecule, has the same value $\tfrac{1}{2}kT$. This result is ordinarily spoken of
as the principle of the equipartition of energy. It should be emphasized
that the principle is a consequence of the assumed quadratic form of
dependence on the variables involved rather than a general consequence
of statistical mechanics.
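
That the mean value $\tfrac{1}{2}kT$ does not depend on the constant $b$ can be checked by evaluating (34.9) numerically for several values of $b$. The Python sketch below is an illustration under assumed values, with $k = 1$ in arbitrary units; the function names are hypothetical.

```python
import math

def midpoint_integral(f, lo, hi, steps=40_000):
    """Midpoint-rule estimate of the integral of f from lo to hi."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (i + 0.5) * h) for i in range(steps))

def mean_quadratic_energy(b, T, k=1.0):
    """Mean of eps = b*p^2 under the weight exp(-eps/kT), as in (34.9)."""
    p_max = 12.0 * math.sqrt(k * T / (2.0 * b))   # twelve widths of the weight
    weight = lambda p: math.exp(-b * p * p / (k * T))
    num = midpoint_integral(lambda p: b * p * p * weight(p), -p_max, p_max)
    den = midpoint_integral(weight, -p_max, p_max)
    return num / den

T = 3.0
for b in (0.1, 1.0, 25.0):
    print(b, mean_quadratic_energy(b, T))   # each value is close to T/2
```

A stiff term (large $b$) confines the variable to a narrow range and a soft term lets it wander widely, but the two effects compensate exactly in the mean energy.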

† See Appendix II for the necessary formulae of integration.

With the help of the principle of the equipartition of energy we can
now easily obtain expressions for the total energy $E$ and heat capacity
at constant volume $C_v$ in certain simple cases where the principle would
apply at least in first approximation. For a dilute monatomic gas
composed of $n$ simple particles we obtain

$$E = \tfrac{3}{2}nkT \quad\text{and}\quad C_v = \tfrac{3}{2}nk, \tag{34.12}$$

corresponding to the three component momenta that contribute
quadratic terms to the kinetic energy of the molecule. Similarly, for the
case of a diatomic gas composed of $n$ molecules which are thought of
as rigid rotators, we obtain

$$E = \tfrac{5}{2}nkT \quad\text{and}\quad C_v = \tfrac{5}{2}nk, \tag{34.13}$$

corresponding to the two additional rotational energy terms that are
now present. And, as a final example, for a crystal composed of $n$
atoms, which are thought of as point particles, oscillating around their
equilibrium positions under the restraint of forces obeying Hooke's law,
we obtain

$$E = 3nkT \quad\text{and}\quad C_v = 3nk, \tag{34.14}$$

corresponding to the three quadratic terms for the kinetic energy and
the three further quadratic terms for the potential energy of a particle.
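
The three cases differ only in the number of quadratic terms counted per molecule, so the results can be tabulated by a very short routine. The Python sketch below is illustrative; the numerical values of $k$ and of the number of molecules in a mol, the temperature, and the function name are assumptions.

```python
def energy_and_heat_capacity(n, quadratic_terms, T, k=1.380e-16):
    """Equipartition estimate: each quadratic term contributes kT/2 per molecule.
    Returns (E, C_v) in erg and erg per degree."""
    E = 0.5 * quadratic_terms * n * k * T
    C_v = 0.5 * quadratic_terms * n * k
    return E, C_v

N_PER_MOL = 6.022e23      # molecules in a mol (assumed value)
cases = {
    "monatomic gas": 3,          # three momenta
    "rigid diatomic gas": 5,     # three momenta and two rotational terms
    "Hooke's-law crystal": 6,    # three kinetic and three potential terms
}
results = {name: energy_and_heat_capacity(N_PER_MOL, terms, 300.0)
           for name, terms in cases.items()}

for name, (E, C_v) in results.items():
    print(name, E, C_v)
```

Per mol these heat capacities are $\tfrac{3}{2}R$, $\tfrac{5}{2}R$, and $3R$, the last being the value associated with the empirical rule of Dulong and Petit.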

The foregoing simple consequences of the principle of equipartition
of energy are approximately correct under appropriate circumstances,
and are among the results of the classical statistical mechanics which
have proved most illuminating for the development of physical-chemical
ideas. The deviations from these simple results on account of
the necessity of using relativistic mechanics in the case of high-velocity
particles are of interest,† and deviations which arise in problems where
we must use quantum instead of classical mechanics are very important
as we shall see later in the present book.

To conclude this section on mean values, we may also consider, in
addition to the energy of a system, its components of linear and of
angular momenta. Using Cartesian coordinates $x$, $y$, $z$ corresponding
to axes stationary with respect to the container for our system, and
assuming the separation of the energy into the potential, kinetic, and
internal energy of a molecule as in (33.3) and (33.4), we readily obtain,
by integrating out over the internal coordinates and momenta,

$$\bar{p}_x = m\bar{\dot{x}} = \frac{\int\ldots\int m\dot{x}\,e^{-[\epsilon(x,y,z)+\frac{1}{2}m(\dot{x}^2+\dot{y}^2+\dot{z}^2)]/kT}\,dx\,dy\,dz\,d\dot{x}\,d\dot{y}\,d\dot{z}}{\int\ldots\int e^{-[\epsilon(x,y,z)+\frac{1}{2}m(\dot{x}^2+\dot{y}^2+\dot{z}^2)]/kT}\,dx\,dy\,dz\,d\dot{x}\,d\dot{y}\,d\dot{z}} \tag{34.15}$$

for the mean linear momentum of the molecules parallel to the $x$-axis, and

$$\bar{m}_z = m\,\overline{(x\dot{y}-y\dot{x})} = \frac{\int\ldots\int m(x\dot{y}-y\dot{x})\,e^{-[\epsilon(x,y,z)+\frac{1}{2}m(\dot{x}^2+\dot{y}^2+\dot{z}^2)]/kT}\,dx\,dy\,dz\,d\dot{x}\,d\dot{y}\,d\dot{z}}{\int\ldots\int e^{-[\epsilon(x,y,z)+\frac{1}{2}m(\dot{x}^2+\dot{y}^2+\dot{z}^2)]/kT}\,dx\,dy\,dz\,d\dot{x}\,d\dot{y}\,d\dot{z}} \tag{34.16}$$

for the mean angular momentum of the molecules around the $z$-axis.
Examining the above expressions, however, we note that in each little
spatial region $dx\,dy\,dz$ the components of velocity would be distributed
symmetrically around the value nought. Hence we obtain mean values
per molecule and thus also total values of linear and angular momentum
for the system as a whole which are zero:

$$P_x = n\bar{p}_x = 0, \qquad M_z = n\bar{m}_z = 0. \tag{34.17}$$

† See Jüttner, Ann. d. Physik, 34, 856 (1911); Tolman, Phil. Mag. 28, 583 (1914).




This result that the most probable values of the linear and angular 
momentum of the system will be zero, when referred to axes stationary 
in the container, is fundamentally a consequence of our original 
starting-point in which we set a definite value for the energy of the 
system but permitted exchange of linear and angular momenta through 
molecular impacts with the walls of our assumed massive container. 

35. The general principle of equipartition 

It has been emphasized in the preceding section that the principle
of the equipartition of energy, which leads to the mean value $\tfrac{1}{2}kT$ for
the energy associated with a single coordinate or momentum $q_1 \ldots p_r$,
only holds in the case of a quadratic dependence of energy on the
variable under consideration. For this reason it is of interest to show
that a considerably more general form of equipartition principle can
be derived,† which applies under wider circumstances and reduces to
the equipartition of energy in the case of an actual quadratic
dependence on the variable involved.

To obtain the desired generalized principle, let us start with the
Maxwell-Boltzmann distribution law in the general form

$$\delta n = nCe^{-\epsilon/kT}\,\delta q_1 \ldots \delta p_r, \tag{35.1}$$

where $n$ is the total number of molecules and $\delta n$ is the number in the
indicated range of coordinates and momenta $\delta q_1 \ldots \delta p_r$. Integrating
over all possible values of the coordinates and momenta, and dividing
through by $n$, we must then obtain the result

$$\int\ldots\int Ce^{-\epsilon/kT}\,dq_1 \ldots dp_r = 1. \tag{35.2}$$

In this expression let us then carry out a partial integration with
respect to any one of the coordinates and momenta $q_1 \ldots p_r$, denoted for
simplicity by $q_1$ whether in actuality a coordinate or a momentum. We
then obtain a result which can be written in the form

$$\int\ldots\int \Big[Ce^{-\epsilon/kT}\,q_1\Big]_{q_1=a}^{q_1=b}\,dq_2 \ldots dp_r + \int\ldots\int Ce^{-\epsilon/kT}\,\frac{q_1}{kT}\,\frac{\partial\epsilon}{\partial q_1}\,dq_1 \ldots dp_r = 1, \tag{35.3}$$

where $a$ and $b$ are respectively the lower and upper limits to the possible
range of values for $q_1$.

In connexion with the above equation we usually encounter one or 
the other of two different types of situation. 

† Tolman, Phys. Rev. 11, 261 (1918).

In the first type of situation the energy of a molecule $\epsilon$ is independent
of the variable under consideration. For example, the variable $q_1$ may be
a cyclic positional coordinate in the absence of any corresponding field
of force. Under such circumstances the second term in (35.3) is equal
to zero, since $\partial\epsilon/\partial q_1$ is itself zero, and the equation reduces to the
obvious but unimportant result

$$\int\ldots\int \Big[Ce^{-\epsilon/kT}\,q_1\Big]_{q_1=a}^{q_1=b}\,dq_2 \ldots dp_r = 1. \tag{35.4}$$

J •/I 

In the second type of situation, which is frequently encountered and
is the one of actual interest, the first term in equation (35.3) proves to
be equal to zero. This occurs very often because the energy $\epsilon$ of the
molecule goes to infinity at both limits $a$ and $b$, thus leading to the
value zero at both those limits owing to the occurrence of $\epsilon$ as a negative
exponent. This is the case when $q_1$ is a positional coordinate having
associated potential energy which increases without limit for positive
and negative values of $q_1$, as in the case of a bound particle, or when
$q_1$ is a momentum having associated kinetic energy that goes to infinity
as $q_1$ goes to plus and minus infinity. The first term in equation (35.3)
can also turn out to be zero because $q_1$ is zero at one limit and $\epsilon$ infinity
at the other. Under such circumstances the first term in (35.3)
disappears and the equation leads to the result

$$\int\ldots\int Ce^{-\epsilon/kT}\,q_1\,\frac{\partial\epsilon}{\partial q_1}\,dq_1 \ldots dp_r = kT. \tag{35.5}$$

Comparing this result with our general expression (34.1) for the
calculation of mean values of functions of the coordinates and momenta
of a molecule, we now obtain a general equipartition principle which can
be written in the form

$$\overline{q_i\,\frac{\partial\epsilon}{\partial q_i}} = \overline{p_j\,\frac{\partial\epsilon}{\partial p_j}} = kT, \tag{35.6}$$

and which applies to the indicated mean values for each coordinate $q_i$
or momentum $p_j$ that behaves at the limits to its range of values in
the frequently encountered manner described above.

When the energy of a molecule does depend quadratically on the
variable in question, so that the portion of the energy associated with
the variable in question has the form

$$\epsilon_i = a q_i^2 \quad\text{or}\quad \epsilon_j = b p_j^2, \tag{35.7}$$

the general principle of equipartition as given by (35.6) is seen to
reduce to the equipartition of energy

$$\bar{\epsilon}_i = \bar{\epsilon}_j = \tfrac{1}{2}kT. \tag{35.8}$$
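
The relation between the general principle (35.6) and the equipartition of energy (35.8) can be checked numerically for a single variable. The following Python sketch is an illustration only and forms no part of the original derivation: the helper `boltzmann_average`, the constants `a`, `c`, and `kT`, the quartic test energy, and the choice of units with k = 1 are all assumptions made for the example. It averages $q\,\partial\epsilon/\partial q$ and $\epsilon$ itself with the weight $e^{-\epsilon/kT}$.

```python
import math

def boltzmann_average(f, energy, kT, lo=-10.0, hi=10.0, steps=20000):
    """Mean of f(q) under the weight exp(-energy(q)/kT), by the midpoint rule."""
    h = (hi - lo) / steps
    num = 0.0
    den = 0.0
    for n in range(steps):
        q = lo + (n + 0.5) * h
        w = math.exp(-energy(q) / kT)
        num += f(q) * w
        den += w
    return num / den

kT = 1.3   # illustrative temperature in units where k = 1 (assumption)
a = 0.7    # illustrative coefficient of the quadratic energy (assumption)
c = 0.5    # illustrative coefficient of the quartic energy (assumption)

# Quadratic case, eps = a*q**2, so that q * d(eps)/dq = 2*a*q**2 = 2*eps.
quad_virial = boltzmann_average(lambda q: 2 * a * q * q, lambda q: a * q * q, kT)
quad_energy = boltzmann_average(lambda q: a * q * q, lambda q: a * q * q, kT)

# Quartic case, eps = c*q**4, so that q * d(eps)/dq = 4*c*q**4 = 4*eps.
quart_virial = boltzmann_average(lambda q: 4 * c * q**4, lambda q: c * q**4, kT)
quart_energy = boltzmann_average(lambda q: c * q**4, lambda q: c * q**4, kT)

print(quad_virial / kT, quad_energy / kT)
print(quart_virial / kT, quart_energy / kT)
```

Under these assumptions the first ratio on each line should come out close to 1, in agreement with (35.6), while the mean energy is close to kT/2 only in the quadratic case and close to kT/4 in the quartic case.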





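On the other hand, the general principle is valid even when we do
not have this simple quadratic form of dependence. Thus, for example,
in the case of a gas composed of particles moving with velocities $(\dot{x}, \dot{y}, \dot{z})$
high enough compared with the velocity of light $c$ so that relativistic
effects must be considered, we have

$$\epsilon = c\,(p_x^2 + p_y^2 + p_z^2 + m^2c^2)^{1/2} \tag{35.9}$$

for the dependence of the energy of a particle $\epsilon$ on its components of
momentum

$$p_x = \frac{m\dot{x}}{\sqrt{1-u^2/c^2}}, \qquad p_y = \frac{m\dot{y}}{\sqrt{1-u^2/c^2}}, \qquad p_z = \frac{m\dot{z}}{\sqrt{1-u^2/c^2}}, \tag{35.10}$$

where $m$ is the rest mass of the particle and $u = (\dot{x}^2 + \dot{y}^2 + \dot{z}^2)^{1/2}$ its total
velocity. Applying the general equipartition principle (35.6), we then
obtain as a relation between mean values

$$\overline{p_x\,\frac{\partial\epsilon}{\partial p_x}} = \overline{p_y\,\frac{\partial\epsilon}{\partial p_y}} = \overline{p_z\,\frac{\partial\epsilon}{\partial p_z}} = kT,$$

or, by substituting from (35.9) and (35.10),

$$\overline{\left(\frac{m\dot{x}^2}{\sqrt{1-u^2/c^2}}\right)} = \overline{\left(\frac{m\dot{y}^2}{\sqrt{1-u^2/c^2}}\right)} = \overline{\left(\frac{m\dot{z}^2}{\sqrt{1-u^2/c^2}}\right)} = kT. \tag{35.11}$$

This consequence of the general equipartition principle is of considerable
interest.†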

In the first place, we note that it gives immediate information as to
the mean rate of transport of momentum by the molecules of the gas
which leads at once to the familiar result

$$P = \frac{nkT}{V} \tag{35.12}$$

for the pressure of the gas even when the particles do have high
velocities.

In the second place, by adding together the individual equations
given by (35.11), we obtain the result

$$\overline{\left(\frac{mu^2}{\sqrt{1-u^2/c^2}}\right)} = 3kT. \tag{35.13}$$

We may compare this with the expression for mean kinetic energy

$$\bar{\epsilon} = \overline{\left(\frac{mc^2}{\sqrt{1-u^2/c^2}}\right)} - mc^2. \tag{35.14}$$

Expanding both expressions in series form and combining, we easily
obtain the result

$$\bar{\epsilon} = \tfrac{3}{2}kT + \overline{\left(\frac{1}{8}\,\frac{mu^4}{c^2} + \frac{1}{8}\,\frac{mu^6}{c^4} + \cdots\right)}. \tag{35.15}$$

† Compare Tolman, Phil. Mag. 28, 583 (1914).


We thus find on the one hand from (35.12) that our gas containing
high-velocity particles would obey the perfect gas laws, but on the
other hand from (35.15) that its kinetic energy would be greater than
for an ordinary perfect gas.
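
Both conclusions can be illustrated by direct numerical integration over the magnitude of the momentum. The sketch below is an illustration under stated assumptions rather than part of the original text: the function `relativistic_means`, the units with m = c = k = 1, the value `kT = 0.5`, and the cut-off `p_max` are all choices made for the example. It uses the isotropic weight $p^2 e^{-\epsilon/kT}$ with $\epsilon$ taken from (35.9).

```python
import math

def relativistic_means(kT, m=1.0, c=1.0, p_max=60.0, steps=60000):
    """Return (mean of p . d(eps)/dp, mean kinetic energy) for an isotropic gas.

    The weight is p**2 * exp(-eps/kT) with eps = c*sqrt(p**2 + m**2*c**2),
    integrated over the magnitude p of the momentum by the midpoint rule.
    """
    h = p_max / steps
    norm = 0.0
    virial = 0.0
    kinetic = 0.0
    for n in range(steps):
        p = (n + 0.5) * h
        eps = c * math.sqrt(p * p + m * m * c * c)
        # The constant rest energy is removed from the exponent; it cancels
        # in the ratios and this merely avoids needless underflow.
        w = p * p * math.exp(-(eps - m * c * c) / kT)
        norm += w
        virial += w * (c * c * p * p / eps)   # sum of p_x*x_dot, p_y*y_dot, p_z*z_dot
        kinetic += w * (eps - m * c * c)      # energy less rest energy
    return virial / norm, kinetic / norm

kT = 0.5  # illustrative value in units where m = c = k = 1 (assumption)
mean_virial, mean_kinetic = relativistic_means(kT)
print(mean_virial / kT, mean_kinetic / kT)
```

With these assumed values the first ratio should be close to 3, as required by (35.13), and the second should exceed 3/2, in line with (35.15).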

This must now conclude our chapter on the pre-quantum status of 
the Maxwell-Boltzmann distribution law for the molecules of a system 
which has come to a steady condition of macroscopic equilibrium. We
have tried to give a real insight into the statistical principles involved 
in the derivation of this law, but have only endeavoured to give a rough 
indication of the nature of its applications. 



V 

COLLISIONS AS A MECHANISM OF CHANGE WITH TIME 
36. Introduction 

In the preceding chapter we have considered the application of 
statistical mechanics to systems which have come to a steady condition 
of equilibrium when looked at from a macroscopic point of view that 
neglects the behaviour of individual molecules. We now wish to com- 
mence the study of systems that are in a condition which is changing 
with time. In the present chapter we shall be concerned with the con- 
sideration of molecular collisions as an important mechanism involved 
in the changes with time that do take place. And in the following 
chapter we shall then complete the classical part of our study by 
deriving and considering Boltzmann’s famous H-theorem, which gives 
a fundamental kind of information as to the direction in which changes 
in condition with time may be expected to occur. 

Except in the specially simple case of collisions between spherically 
symmetrical molecules, a treatment of the classical theory of collisions, 
adequate for a derivation of the H-theorem, proves to be a somewhat 
complicated matter. This necessary complexity was appreciated by 
Boltzmann himself, but has not always been emphasized by later 
expositors who have tended to confine their demonstrations of the 
tendency for H to decrease with time to the case of systems composed 
of rigid, elastic spheres. In view of the elaboration actually needed, it 
will be desirable to give a brief outline of the treatment to be presented 
in this chapter. 

We begin the development in the next section, § 37, by first giving 
a preliminary discussion of two useful principles pertaining to the
temporal behaviour of mechanical systems, which may be called the
principles of dynamical reversibility and of dynamical reflectability. In
accordance with these principles, both the reverse and the mirror image 
of any possible motion for a dynamical system would also itself be a 
possible motion. We shall find these principles of use in our discussion 
of the different kinds of molecular collisions that can take place. We 
shall also be interested later in reconciling the reversibility possible for 
any mechanical behaviour with the irreversibilities actually observed 
in the behaviour of physical chemical systems. 

We then commence our detailed treatment of molecular collisions,
in § 38, by considering an appropriate method of specifying the
different possible states $i$ of a molecule when the effect of collisions is
to be studied. This we do by leaving the coordinates $q_1, q_2, q_3$ for the
centre of gravity of the molecule undefined, and specifying the values
for the remaining coordinates and the momenta $q_4 \ldots q_r$, $p_1 \ldots p_r$ within
an infinitesimal range $\delta\omega$. This is followed by a useful classification
correlating any state $i$ of a molecule with possible identical states $i$,
congruent states $i'$, reverse states $-i$, and enantiomorphous states $\bar{i}$.

In §§ 39 and 40 we can then classify the possible constellations $(j, i)$
for a pair of colliding molecules, and consider the different kinds of
collisions $\binom{i,\,j}{k,\,l}$ that can take place in which a pair of molecules enter
the encounter in states $i$ and $j$ and leave in states $k$ and $l$. For any
chosen collision $\binom{i,\,j}{k,\,l}$ it is possible to construct congruent collisions
$\binom{i',\,j'}{k',\,l'}$, enantiomorphous collisions $\binom{\bar{i},\,\bar{j}}{\bar{k},\,\bar{l}}$, and the reverse collision
$\binom{-k,\,-l}{-i,\,-j}$. It is also always possible to construct what is called the
corresponding collision $\binom{k,\,l}{m,\,n}$, in which the molecules enter the new
collision in the states with which they left the originally chosen colli-
sion. It is not in general possible, however, to construct an inverse
collision $\binom{k,\,l}{i,\,j}$ such that the molecules would be returned from their
final back to their initial states.

In §§ 41 and 42 we then demonstrate the general possibility of
arranging all possible collisions for a given pair of molecules in closed
cycles containing a succession of collisions each of which corresponds
to the preceding one in the list until we finally arrive at the first
member with which we started. We also show for the case of spherically
symmetrical molecules that such a closed cycle of corresponding colli-
sions would reduce to only two members, each of which in this very
special case would then be the inverse collision to the other.

After completing this rather long account of formal consequences of
the definitions selected for molecular states and collisions, we are then
ready to apply the principles of mechanics to molecular collisions. In
§ 43 we first call attention in passing to the results of applying the
conservation laws for energy and for the components of linear and
angular momentum to a collision between two molecules. In § 44,
returning to the line of argument needed to prepare for our later deriva-
tion of the H-theorem, we then apply Liouville’s theorem, for the con-
servation of extension in phase, to the behaviour of two molecules when
they pass through a collision. This leads, after quite a long and com-
plicated calculation, to an important relation (44.19) which connects
the velocity of approach and range of variables defining the initiation
of any collision with the velocity of recession and range of variables
defining the completion of that collision.

With the help of this consequence of the conservation of extension
in phase, it then finally becomes possible in § 45 to apply the methods
of statistical mechanics to the study of the probable frequencies with
which different kinds of collisions would take place in a gas having
a specified distribution over different molecular states. Considering any
given collision $\binom{i,\,j}{k,\,l}$ together with its reverse collision $\binom{-k,\,-l}{-i,\,-j}$ and
its corresponding collision $\binom{k,\,l}{m,\,n}$, it is found that the probable numbers
of such collisions, taking place in unit time, would be proportional to the
densities of molecular distribution for the pairs of initial states involved
in the collision, with a proportionality constant or probability coefficient
having just the same value for all three of the related collisions.

The chapter then closes in § 46 with some remarks on the necessity
of supplementing the study of classical bimolecular collisions by further
considerations in order to obtain a complete picture of the mechanisms
leading to the changes that take place with time in the condition of
actual physical chemical systems.

The two findings, first that all possible kinds of collision between
two molecules can be arranged into closed cycles in which each collision
corresponds to the preceding collision in the list, and second that the
probability coefficients for the frequencies of any given collision and
the collision corresponding thereto have the same value, are the two
final consequences of the present chapter which will be needed for the
proof of Boltzmann’s H-theorem. The derivation of these findings is
much simplified when attention is restricted to collisions between rigid,
elastic spheres. The closed cycles of corresponding collisions then reduce
to pairs of collisions which are the inverse of each other as shown in
§ 42, and the elaborate discussion of the application of the principle of
the conservation of extension in phase as given in § 44 becomes shorter
and easier to follow. A consideration of this simplified case can be
readily carried through if desired in order to obtain a valid idea of the
general nature of the argument.


37. The principles of dynamical reversibility and reflectability 

Before proceeding to the detailed treatment of molecular collisions
as a mechanism of change with time, we shall devote the present section
to two general principles pertaining to the temporal behaviour of
mechanical systems. Taking for the moment the point of view of the
special theory of relativity and considering different sets of Galilean
coordinates which might be employed, it will be appreciated that
the equations of mechanics would be covariant to the translation or
rotation of spatial axes, to the Lorentz transformation involving
both spatial and temporal axes, and to inversions of the time axis
or of any chosen spatial axis. The two possibilities of inverting
the time axis or a chosen spatial axis may be called the principles
of dynamical reversibility and reflectability. These principles will
be sufficiently important for our work so that we can afford to
give them detailed treatment. For our purposes it will be sufficient
to take the point of view of Newtonian rather than of relativistic
mechanics.

(a) The principle of dynamical reversibility. We may first consider
a derivation of the principle of dynamical reversibility. For this pur-
pose, let us consider an isolated, conservative mechanical system of
$f$ degrees of freedom, corresponding to the coordinates $q_1 \ldots q_f$, and let
us take its behaviour as governed by Lagrange’s equations of motion
(9.7) in the form

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}_i} - \frac{\partial L}{\partial q_i} = 0 \qquad (i = 1, 2, 3, \ldots, f). \tag{37.1}$$


The system is taken for simplicity as isolated and conservative since
we regard it as possible to look at any system from a point of view
which would make this true, and the system is taken for simplicity as
purely mechanical. The result obtained will apply under somewhat
more general circumstances to systems, which may be electromagnetic
as well as mechanical in nature, and which may be acted on by external
forces provided these are derivable from a potential or, if magnetic, are
taken as produced by external currents which are themselves reversed
in direction when the reversed motion is considered.

On account of the conservative character of the system the Lagran-
gian function $L$ will not contain the time $t$ explicitly. In addition we
can make use of the circumstance that the Lagrangian function for
mechanical systems is known to be a quadratic or in any case an even
function of the generalized velocities $\dot{q}$. With these restrictions on the
form of $L$, it is then evident that equations (37.1) could also be written
in the form

$$\frac{d}{d(-t)}\frac{\partial L}{\partial(-\dot{q}_i)} - \frac{\partial L}{\partial q_i} = 0 \qquad (i = 1, 2, 3, \ldots, f). \tag{37.2}$$

Hence either of the sets of equations (37.1) or (37.2) can be taken as
giving a valid description of the possible motions of the system.

The fact that the equations of motion for our system may be written
in either of the forms (37.1) or (37.2) now makes it easy to draw the
desired conclusion. Corresponding to any particular solution of equa-
tions (37.1) which gives the coordinates as a function of time

$$q_i = f_i(t) \qquad (i = 1, 2, 3, \ldots, f), \tag{37.3}$$

it is evident that there will be a solution of equations (37.2) which can
be written in the form

$$q_i = f_i(-t) \qquad (i = 1, 2, 3, \ldots, f). \tag{37.4}$$

Both of these solutions, however, agree with possible motions of the
system since we appreciate that they are both possible solutions of
the Lagrangian equations for the system.

To make the nature of the above result evident, let us, for the sake
of specificity, designate the possible behaviour described by (37.3) as
a forward motion of the system and the behaviour described by (37.4)
as the corresponding reverse motion; and let us consider two systems of
the kind under consideration, $S$ and $S'$, one carrying out the forward
motion and the other the reverse motion. At time $t = 0$ the configura-
tions of the two systems $S$ and $S'$, as expressed by the values of their
coordinates

$$q_i'(0) = q_i(0), \tag{37.5}$$

would be in agreement, in accordance with (37.3) and (37.4), and their
motions, as expressed by the values of their generalized velocities

$$\dot{q}_i'(0) = -\dot{q}_i(0), \tag{37.6}$$

would be in reverse directions. At any later time $t$ the configuration
for the system $S'$ would agree with that for system $S$ at an earlier
time $(-t)$,

$$q_i'(t) = q_i(-t), \tag{37.7}$$

and the motion would be in the reverse direction to that which pre-
vailed for $S$ at that earlier time,

$$\dot{q}_i'(t) = -\dot{q}_i(-t). \tag{37.8}$$

Thus we see, corresponding to any possible motion of a system of the
kind mentioned, that there would be a possible reverse motion in which
the same values of the coordinates would be reached in the reverse
order with reversed values for the velocities. This is the content of the
principle of dynamical reversibility.
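
The principle can be illustrated with a small numerical experiment. The sketch below is only an illustration and rests on assumptions not found in the text: a single coordinate with unit mass, an anharmonic force `-q - q**3`, and the velocity Verlet integration scheme, which is chosen because each of its steps is itself reversible. A forward motion is computed, the final velocity is reversed, and the motion is then continued for the same length of time.

```python
def force(q):
    """Illustrative conservative force derived from V = q**2/2 + q**4/4."""
    return -q - q**3

def verlet_step(q, v, dt):
    """One velocity Verlet step for unit mass."""
    v_half = v + 0.5 * dt * force(q)
    q_new = q + dt * v_half
    v_new = v_half + 0.5 * dt * force(q_new)
    return q_new, v_new

def trajectory(q, v, dt, steps):
    """List of (coordinate, velocity) pairs at successive times."""
    path = [(q, v)]
    for _ in range(steps):
        q, v = verlet_step(q, v, dt)
        path.append((q, v))
    return path

dt, steps = 0.01, 2000                      # illustrative values (assumption)
forward = trajectory(1.0, 0.3, dt, steps)   # forward motion of system S

# System S' starts from the final configuration of S with reversed velocity.
q_end, v_end = forward[-1]
backward = trajectory(q_end, -v_end, dt, steps)

# S' should pass through the same coordinates in the reverse order,
# with velocities of opposite sign.
max_q_err = max(abs(qb - qf) for (qb, _), (qf, _) in zip(backward, reversed(forward)))
max_v_err = max(abs(vb + vf) for (_, vb), (_, vf) in zip(backward, reversed(forward)))
print(max_q_err, max_v_err)
```

Both printed deviations should be at the level of rounding error, which is the numerical counterpart of relations (37.7) and (37.8).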

This principle, which is a direct consequence of the classical mecha-
nics, shows that any motion of an isolated mechanical system and its
reverse are equally possible. We shall find this result of importance
in understanding the different kinds of molecular collision which can
take place. We shall also, however, be interested in the principle from
a broader point of view, since we shall have to reconcile the reversibility
in the microscopic behaviour of physical-chemical systems, which would
be predicted by the above principle of mechanics, with the actual
irreversibilities, which can be observed in their macroscopic behaviour.
It will perhaps already be evident that the reconciliation will depend
on recognizing that the predicted reversibility is an attribute of changes
from one precisely defined state to another, which must be treated by
the methods of exact mechanics, while the observed irreversibilities are
an attribute of changes from one less precisely specified condition to
another, which must be treated by the methods of statistical mechanics.

(b) The principle of dynamical reflectability. In connexion with the
above finding that any solution of the equations of motion for an
isolated conservative mechanical system remains a valid description of
possible behaviour after a change in the sign of the time, it is also
of interest to show a similar possibility for changes in the sign of an
appropriate group of the coordinates. For this purpose it will be neces-
sary to distinguish between different spatial directions, and Lagrange’s
equations for our isolated conservative system may now be written in
the form

$$\begin{aligned}
\frac{d}{dt}\frac{\partial L}{\partial \dot{x}_i} - \frac{\partial L}{\partial x_i} &= 0,\\
\frac{d}{dt}\frac{\partial L}{\partial \dot{y}_i} - \frac{\partial L}{\partial y_i} &= 0,\\
\frac{d}{dt}\frac{\partial L}{\partial \dot{z}_i} - \frac{\partial L}{\partial z_i} &= 0,
\end{aligned} \tag{37.9}$$

where the subscript $i$ now refers to the different component parts of
the system, and the coordinates $x_i$, $y_i$, $z_i$ are with respect to some set
of Cartesian axes. If we now take the Lagrangian function $L = T - V$
as an even function of each of the velocities $\dot{x}_i$, $\dot{y}_i$, $\dot{z}_i$, since the kinetic
energy $T$ will now have this dependence, and also take $L$ as an even
function of the coordinate differences between parts, of the type
$(x_i - x_k)$, $(y_i - y_k)$, $(z_i - z_k)$, since the potential energy $V$ will depend on
the distances between parts, it is evident that the above equations
could also be written in the form

$$\begin{aligned}
\frac{d}{dt}\frac{\partial L}{\partial \dot{x}_i} - \frac{\partial L}{\partial x_i} &= 0,\\
\frac{d}{dt}\frac{\partial L}{\partial \dot{y}_i} - \frac{\partial L}{\partial y_i} &= 0,\\
\frac{d}{dt}\frac{\partial L}{\partial(-\dot{z}_i)} - \frac{\partial L}{\partial(-z_i)} &= 0,
\end{aligned} \tag{37.10}$$

where all the coordinates and velocities $z_i$, $\dot{z}_i$ parallel to one of the axes,
in this case the $z$-axis, have been changed in sign.

With the help of these two forms in which the equations of motion
for our system could be taken, we now see, writing

$$x_i = f_i(t), \qquad y_i = g_i(t), \qquad z_i = h_i(t), \tag{37.11}$$

as a possible particular solution of the equations of motion, that

$$x_i = f_i(t), \qquad y_i = g_i(t), \qquad z_i' = -h_i(t), \tag{37.12}$$

would also be a possible solution, where $z_i'$ is the negative of $z_i$ for all
the component parts $i$ at all times $t$. Since a system $S$ carrying out the
motion (37.11) and a system $S'$ carrying out the motion (37.12) would
bear to each other the relation of mirror images as reflected at the
$xy$-plane, this result might be called the principle of dynamical reflecta-
bility. It will now be appreciated, of course, that the position and
orientation of such a reflecting plane might be chosen arbitrarily,
and we may now say that any motion of a system and any mirror
image or enantiomorph of that motion would be equally possible. There
will be a place in what follows (§ 42) where the principle of dynamical
reflectability can be employed to advantage.
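
The mirror-image relation between two such motions can also be checked numerically. The following sketch is an illustration under assumptions of its own: two parts of unit mass joined by a spring whose force depends only on their distance, the constants `k` and `rest`, the starting values, and the velocity Verlet scheme are all chosen for the example. One system is started from given positions and velocities, the other from their reflections at the xy-plane.

```python
import math

def accelerations(pos, k=1.0, rest=1.0):
    """Accelerations of two unit masses joined by a spring (distance-only force)."""
    (x1, y1, z1), (x2, y2, z2) = pos
    dx, dy, dz = x2 - x1, y2 - y1, z2 - z1
    r = math.sqrt(dx * dx + dy * dy + dz * dz)
    f = k * (r - rest) / r          # positive when the spring is stretched
    return [(f * dx, f * dy, f * dz), (-f * dx, -f * dy, -f * dz)]

def simulate(pos, vel, dt=0.01, steps=500):
    """Velocity Verlet integration; returns the list of configurations."""
    pos = [tuple(p) for p in pos]
    vel = [tuple(v) for v in vel]
    history = [pos]
    acc = accelerations(pos)
    for _ in range(steps):
        half = [tuple(v[c] + 0.5 * dt * a[c] for c in range(3)) for v, a in zip(vel, acc)]
        pos = [tuple(p[c] + dt * h[c] for c in range(3)) for p, h in zip(pos, half)]
        acc = accelerations(pos)
        vel = [tuple(h[c] + 0.5 * dt * a[c] for c in range(3)) for h, a in zip(half, acc)]
        history.append(pos)
    return history

def mirror_z(vectors):
    """Reflect a list of vectors at the xy-plane."""
    return [(x, y, -z) for x, y, z in vectors]

pos0 = [(0.0, 0.0, 0.0), (1.5, 0.2, -0.3)]     # illustrative starting values
vel0 = [(0.1, -0.2, 0.3), (-0.1, 0.4, 0.05)]

original = simulate(pos0, vel0)                        # system S
mirrored = simulate(mirror_z(pos0), mirror_z(vel0))    # system S'

# At every step S' should be the reflection of S at the xy-plane.
max_deviation = max(
    abs(pm[c] - pe[c])
    for frame_m, frame_o in zip(mirrored, original)
    for pm, pe in zip(frame_m, mirror_z(frame_o))
    for c in range(3)
)
print(max_deviation)
```

The printed deviation should be zero or at the level of rounding error, showing that the reflected starting conditions lead to the reflected motion throughout.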


38. Molecular states 

(a) Specification of molecular states appropriate for the consideration 
of collisions. We are now ready to prepare the way for the treatment 
of molecular collisions by considering an appropriate method for speci- 
fying the different states into which a molecule may be thrown as a 
result of collision with other molecules. 

In accordance with our previous procedure, we may regard the pre-
cise state of a molecule of $r$ degrees of freedom as completely specified at
any time of interest by the exact values of $r$ coordinates and $r$ momenta,
$q_1 \ldots q_r$, $p_1 \ldots p_r$. It is often convenient to choose the first three of these variables
as Cartesian coordinates $x$, $y$, $z$ for the centre of gravity of the molecule,
so that the total collection will become

$$x,\; y,\; z,\; q_4 \ldots q_r,\; p_x,\; p_y,\; p_z,\; p_4 \ldots p_r, \tag{38.1}$$

where $p_x$, $p_y$, and $p_z$ are the components of linear momentum for the
molecule as a whole.

However, for the purpose of treating the changes in molecular con-
dition that occur on collision, such a complete and precise specification
of state would usually be more than we desire. In the first place, we
shall for the most part not be interested in the spatial location $x$, $y$, $z$
of a molecule when it is changed by collision from one condition to
another. In the second place, in agreement with the circumstance that
we shall ultimately wish to employ the methods of statistical rather
than of exact mechanics, we shall be interested in the specification of
the other variables $q_4 \ldots p_r$ only to within a small range of values. For
these reasons we shall now often wish to specify the state of a molecule
only to the degree corresponding to an infinitesimal range $\delta\omega$ of the form

$$\delta\omega = \int\cdots\int dq_4 \cdots dq_r\,dp_x\,dp_y\,dp_z\,dp_4 \cdots dp_r, \tag{38.2}$$

where we regard the limits of integration for the variables $q_4 \ldots q_r$,
$p_x$, $p_y$, $p_z$, $p_4 \ldots p_r$ as lying close together in correspondence with the
infinitesimal character of $\delta\omega$. By assigning indices to designate different
possible such regions, $\delta\omega_1, \delta\omega_2, \delta\omega_3, \ldots, \delta\omega_i, \ldots$, which we may wish to
consider, we can then take a molecular state as sufficiently specified
for many of our purposes by the index $i$ which denotes some particular
selected range $\delta\omega_i$ out of the various ones that interest us.

(b) Classification of molecular states. For the treatment of molecular
collisions, it will also be useful to have a classification of mole-
cular states into what we may call identical, congruent, reverse, and
enantiomorphous states. In order to introduce this classification it is
convenient to note that any particular specification of the internal
coordinates and the external and internal momenta of a molecule,
$q_4 \ldots p_r$, may be regarded as corresponding to a geometrical figure which
gives the configuration of the constituent parts of the molecule and also
carries appropriately attached vectors to represent the magnitudes and
directions of the velocities of those parts. The desired classification can
then be made to depend on the possibilities for the conceptual super-
position of such geometrical figures, regarding them as rigid but
movable structures, as in procedures familiar in elementary geometry.

Identical states for a pair of similar molecules may then be defined
as those where it would be possible to secure superposition, in the above
sense, with the help of a simple translation of one of the molecules,
leaving the configuration of its constituent parts and the directions of
the vectors representing velocities unchanged, and not changing the
relation of the molecules to any external field of force that might be
acting. This definition of identity is thus chosen to agree with the
circumstance that we shall not be specially interested in the mere
spatial location when molecules change by collision from one state to
another. The possibility for the existence of identical states is obvious.

Congruent states for a pair of molecules may be defined as those where 
a rotation as well as a translation might be needed to secure super- 
position. Congruent states thus become identical after a suitable rota- 
tion. In making such rotations the configuration, the relative directions 
of the vectors representing velocities, and the relation to external fields 
of force are all to be kept unaltered. Congruence for molecular physics 
is thus rather similar to congruence for ordinary geometry, provided 
we consider the additional circumstances that arise from the presence 
of vectors representing velocities and fields of force. The possible 
existence of congruent states is obvious from the invariance of the laws 
of mechanics to the translation and rotation of axes. 

Reverse states for a pair of molecules may be defined as those which
become identical, i.e. superposable by translation alone, after there
has been a reversal in direction without change of magnitude in all
the velocities for one of the molecules.† The idea of reverse states is
thus a physical one for which there is no very immediate geometrical
analogue. We shall find the idea of reverse states quite important in
our consideration of molecular collisions. The possible existence of
reverse states is guaranteed by the principle of dynamical reversibility.

Enantiomorphous states for a pair of molecules may now finally be
defined as those where one state would be identical with the mirror
image of the other. The idea of enantiomorphous states is already
familiar in organic chemistry, but is here extended to include the
reflection of velocities and fields of force as well as of configuration.
It is interesting to note that this provides possibilities for truly enantio-
morphic forms even in the absence of any asymmetric atom. There is
really only one place in what follows where this idea as to enantio-
morphous states will be of importance to us. The possible existence of
such states is guaranteed by the principle of dynamical reflectability.

† The definition of reverse states here adopted differs from that in the author’s earlier
book, Statistical Mechanics with Applications to Physics and Chemistry, where reverse
states were defined as becoming merely congruent rather than identical after the reversal
of velocities for one of the pair of structures under consideration. The present definition
is less likely to lead to confusion. Similar remarks apply to the following case of enantio-
morphous states.

The foregoing definitions as to what we shall mean by identical, 
congruent, reverse, and enantiomorphous states have been chosen in 
such an obviously natural manner as perhaps to make the detailed 
statements that we have given unnecessary. Nevertheless, we shall find 
these terms quite useful. They can evidently be applied not only to 
states of single molecules, but also to systems consisting of more than 
one molecule, and may be used to characterize the nature of processes 
as well as of states. If we use the symbol i to designate a particular 
state of a molecule, the symbol i' may be used to designate a congruent 
state, the symbol $-i$ to designate the reverse state, and the symbol $\bar{i}$
in the rare cases when necessary to designate an enantiomorphous state.
Such symbols can be taken when necessary as referring to states which
are precisely defined except as to spatial location, but will ordinarily
be used as referring to states defined merely to the extent of corre-
spondence with an infinitesimal range $\delta\omega$ as discussed in the foregoing.

39. Molecular constellations 

We are now ready to turn to the consideration of pairs of molecules
which are undergoing collision. Introducing the nomenclature of Boltz-
mann,† such a pair of molecules may be called a constellation.

In classifying the different kinds of constellation that can occur we
may evidently use the phrases identical, congruent, reverse, or enantio-
morphous constellation as descriptive terms having the same significance
as applied above to molecular states. In addition, however, we shall
need some further terms which we must now consider.

As a convenient abstraction, originally employed by Boltzmann, we
shall regard two molecules as having a negligible effect on each other
until their centres of gravity approach within a critical distance $b$. In
our consideration of collisions we shall then regard the collision as com-
mencing when the centres of gravity arrive at this critical distance and
as being completed when they again reach the distance $b$ on their way
apart. In accordance with this procedure we shall now introduce the
term critical constellation to denote a situation where the two molecules
are at the critical distance apart, and shall introduce the terms initial
constellation and final constellation to distinguish between critical con-
stellations where a collision is just beginning or just ending.

† See Boltzmann, Vorlesungen über Gastheorie, zweiter, unveränderter Abdruck, II.
Teil, Leipzig, 1912.


One further descriptive term will also be needed. If we consider any
given critical constellation it is evident that another critical constella-
tion could be constructed by interchanging the positions of the centres
of gravity for the two molecules, or, in other words, by translating one
of the molecules relative to the other a distance $2b$ along the line con-
necting their centres of gravity, so as again to arrive at a situation
where they would be separated by the critical distance $b$. If such an
interchange or translation is carried out, without changing the states
of the two molecules either as to internal configuration or as to the
directions of velocities, we shall define the new constellation as the
corresponding constellation to the one originally taken.

In the case of two corresponding constellations, it is evident, if one
of the two is an initial constellation, that the other could only be a final
constellation, since if the two molecules were originally approaching
each other it is obvious that they would have to be receding from each
other after an interchange of positions without disturbance of velocities,
and vice versa if they were originally receding. Furthermore, it may
be remarked, as will be made evident in the next section, that for any
arbitrary initial constellation the corresponding constellation would be
a final constellation with which a conceivable collision actually could
end, and that for any arbitrary final constellation the corresponding
constellation would be an initial constellation with which a conceivable
collision actually could begin.

In order to specify a given critical constellation it is evident, in the 
first place, that we should have to specify the states of the two 
molecules involved, to the extent of describing the external motions 
of the molecules as a whole and the internal configurations and 
motions of their parts. And it is evident, in the second place, that 
we should also have to specify the direction of the line joining their 
centres of gravity, thus describing the angle of attack in the case of 
an initial constellation, or angle of departure in the case of a final con- 
stellation. 

To specify the states of the molecules, in the above sense, we
could, if desired, give the precise values of the necessary variables
$q_4 \ldots q_r$, $p_x$, $p_y$, $p_z$, $p_4 \ldots p_r$ for the case of a molecule of $r$ degrees of
freedom. But for our purposes it will usually be sufficient to regard
such states as specified by the indices $i$, $j$, $k$, etc., which denote in-
finitesimal ranges for such values $\delta\omega_i$, $\delta\omega_j$, $\delta\omega_k$, etc., as described in
the preceding section. Since the critical distance $b$ has been defined
as sufficient to make the influence of the molecules on each other
negligible, it is evident that such possible states would be those for a
free molecule.

To specify the direction of the line of centres for the critical con-
stellation under consideration, we could give the precise values for the
polar and equatorial angles $\theta$ and $\phi$ which together with the critical
distance $b$ as a radius would give the position of the centre of gravity,
say of the second of the two molecules, with respect to the centre of
gravity of the first as an origin. But here, too, it will usually be suffi-
cient to regard the specification as given only to the extent of an
infinitesimal solid angle 8A, or corresponding infinitesimal surface 8A 
on the sphere of radius b surrounding the first molecule. 

To denote the different critical constellations that could be specified
in the above manner we shall use Boltzmann’s symbols of the form
$(j, i)$, where the letters $i$ and $j$ can be thought of as denoting the states
of the two molecules involved. Such symbols are evidently not com-
plete since they contain no reference to the angle of attack or departure,
but they will be sufficient for our purposes. If a given critical con-
stellation is denoted by $(j, i)$, a possible congruent constellation will be
denoted by $(j', i')$, a possible enantiomorphous constellation by $(\bar{j}, \bar{i})$,
the reverse constellation by $(-j, -i)$, and the corresponding constella-
tion by $(i, j)$. If $(j, i)$ is an initial constellation, it will be noted that
$(j', i')$ and $(\bar{j}, \bar{i})$ will also be initial constellations, while $(-j, -i)$ and
$(i, j)$ will be final constellations, and vice versa if $(j, i)$ is a final con-
stellation.

40. Molecular collisions 

Having completed what may seem to be an unduly elaborate de- 
velopment of preliminary apparatus, we are now ready to consider 
molecular collisions themselves. To specify any such collision it is 
evident that we should only have to specify the initial constellation
with which it begins, since the nature of the collision itself would then
be determined by the laws of mechanics. Thus a specification of the
initial constellation, say $(j, i)$, with which a collision begins would also
amount to a specification of the final constellation, say $(k, l)$, with
which it ends.

In accordance with this determinate relation between initial and final
constellations, we may now introduce Boltzmann’s symbols of the form
$\binom{j,\,i}{k,\,l}$ to denote different possible collisions, where the upper letters
$i$ and $j$ designate, in our previous sense, the states of the two molecules


§40 


CLASSIFICATION OF COLLISIONS 


111 


in the initial constellation, and the lower letters $k$ and $l$ designate the
states in the final constellation. It will be appreciated that such a
symbol provides a more complete description of the collision intended
than would be given by the mere symbol $(j, i)$ for the initial constellation,
since by including a designation of the final states $k$ and $l$ we lay
necessary requirements on the angle of attack, when the collision starts,
which, as was noted above, have been omitted from our incomplete
symbols for constellations.


In using symbols of the form $\binom{j,\,i}{k,\,l}$ to designate collisions, it will be
convenient to introduce the convention that the right-hand letter in
the upper row and the left-hand letter in the lower row refer to states
of one of the molecules, while the other two letters refer to states of
the other molecule. Thus in the above case $i$ and $k$ are the initial and
final states of one of the molecules and $j$ and $l$ of the other. This convention
corresponds with the idea that we could regard the two molecules
in states $i$ and $j$ as first approaching each other in an initial
constellation which could be more completely symbolized than previously
by $(j \rightarrow,\ \leftarrow i)$, as then passing through each other’s field of
influence, and as later separating from each other in states $k$ and $l$ in
a final constellation which could be symbolized by $(\leftarrow k,\ l \rightarrow)$. It is to
be noted that there is nothing in the idea of a collision to make it
necessary for the two colliding molecules to be of the same kind, and
that our symbolism is equally adapted both for cases of like and of
unlike molecules.


Considering some given collision symbolized by $\binom{j,\,i}{k,\,l}$, we shall regard
a symbol of the form $\binom{j',\,i'}{k',\,l'}$ as designating a congruent collision, which
would agree with the original collision after translation and rotation in
the sense discussed in § 38. It will be appreciated that an infinite
variety of collisions congruent to any given collision would be possible,
and it will be noted that a mere congruence for the initial states with
which two molecules enter a collision without any specification as to
angle of attack would not in general be sufficient to secure a completely
congruent collision.

Again considering some given collision $\binom{j,\,i}{k,\,l}$ we shall regard a symbol
of the form [symbol illegible in the source] as designating an enantiomorphous collision. An
infinite variety of such enantiomorphous collisions, using different
planes of reflection, would be possible in accordance with the principle
of dynamical reflectability.

Again considering some given collision $\binom{j,\,i}{k,\,l}$ we shall regard the
symbol $\binom{-k,\,-l}{-j,\,-i}$ as designating the reverse collision thereto. This is
to be understood as the collision that would occur if we took the final
constellation $(k, l)$ of the original collision and then reversed all the
velocities involved, both of the molecules as a whole and of their
internal parts, so as to secure an initial constellation $(-k, -l)$. In
accordance with the principle of dynamical reversibility the molecules
would then retrace their previous configurations to arrive in the final
constellation $(-j, -i)$, with the two molecules now in states which
are the reverse of those with which the original collision started. It
may be specially noted that the principle of dynamical reversibility
guarantees for any given collision the possible existence of a reverse
collision, that for any precisely defined collision there is only a single
such reverse collision, and that the reverse collision does not end with
the molecules in the initial states which they had in the original given
collision, but in the reverse of those states.


Again considering some given collision $\binom{j,\,i}{k,\,l}$ we shall now define the
corresponding collision thereto as that which would occur if we took
the final constellation $(k, l)$, constructed therefrom the corresponding
initial constellation $(l, k)$, and then allowed that collision to take place
which would be demanded in accordance with the laws of mechanics.
To denote such a corresponding collision we may use the symbol $\binom{l,\,k}{x,\,y}$
where $x$ and $y$ denote whatever final states do arise from the collision
that we have thus prescribed. From the procedure given it is
evident for any given precisely defined collision that there always
would be one and only one corresponding collision.

Finally, again considering some given collision $\binom{j,\,i}{k,\,l}$ we shall often
be specially interested in the possibility of returning the two molecules
from their final states $k$ and $l$ to their initial states $i$ and $j$. If this can
be done in a single collision we shall call it an inverse collision to the
original one. Such a collision can be denoted by the symbol $\binom{l,\,k}{i,\,j}$ provided
we now allow the molecules in states $l$ and $k$ to come together
with any angle of attack that would assure the desired final result that
the molecules are to emerge from the collision in states i and j. As we 
shall emphasize later, it is in general not possible to find an inverse 
collision for any given collision. Inverse collisions only exist in special 
simple cases, and are thus unlike reverse and corresponding collisions 
which must exist in any case. 
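The bookkeeping behind the symbols of this section can be sketched in a few lines of Python. This is only an illustrative sketch: the names `State`, `Collision`, `reverse_collision`, `corresponding_initial`, and `show` are hypothetical helpers introduced here, not notation from the text. The sketch manipulates labels only. It cannot supply the final states $x$ and $y$ of a corresponding collision, since those are fixed by the laws of mechanics and not by the symbolism.

```python
from typing import NamedTuple, Tuple


class State(NamedTuple):
    """A state label; `flipped` marks the state with all velocities reversed."""
    label: str
    flipped: bool = False

    def reverse(self) -> "State":
        return State(self.label, not self.flipped)

    def __str__(self) -> str:
        return ("-" if self.flipped else "") + self.label


Constellation = Tuple[State, State]


class Collision(NamedTuple):
    """Upper row = initial constellation, lower row = final constellation."""
    initial: Constellation
    final: Constellation


def reverse_constellation(c: Constellation) -> Constellation:
    # Reverse every velocity, keeping the positions: (j, i) -> (-j, -i).
    return (c[0].reverse(), c[1].reverse())


def corresponding_constellation(c: Constellation) -> Constellation:
    # Interchange the centres of gravity: (j, i) -> (i, j).
    return (c[1], c[0])


def reverse_collision(col: Collision) -> Collision:
    # Starts from the reversed final constellation and ends in the
    # reversed initial constellation.
    return Collision(reverse_constellation(col.final),
                     reverse_constellation(col.initial))


def corresponding_initial(col: Collision) -> Constellation:
    # Only the initial constellation of the corresponding collision is
    # known from the symbols; its final states depend on the mechanics.
    return corresponding_constellation(col.final)


def show(col: Collision) -> str:
    (a, b), (c, d) = col
    return f"({a},{b} / {c},{d})"


if __name__ == "__main__":
    j, i, k, l = (State(s) for s in "jikl")
    print(show(reverse_collision(Collision((j, i), (k, l)))))
```

Reversing a collision twice returns the original symbol, which mirrors the statement that each precisely defined collision has a single reverse collision.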

With the help of the foregoing formalism for the treatment of colli- 
sions we may now consider some interesting properties of collisions and 
constellations. 

As already mentioned in the preceding section, for any final constellation
there will be a corresponding initial constellation with which
a collision might actually begin, and for any initial constellation there
will be a corresponding final constellation with which a collision might
actually end. The validity of the first half of the above statement is
immediately evident, since for any final constellation $(k, l)$ we can
obviously construct the corresponding constellation $(l, k)$, and this will
then be an initial constellation for whatever collision does occur, as
we have already remarked in defining corresponding collisions. The
validity of the second half of the statement is less immediately evident,
since considering any initial constellation $(y, x)$ it now becomes necessary
to prove that the corresponding constellation $(x, y)$ which we obtain
by an interchange of centres of gravity really is one that could occur
at the end of an actual collision. To demonstrate this let us consider
the constellation $(x, y)$, which can in any case be set up, and construct
the reverse constellation $(-x, -y)$. Since the two molecules were
approaching in $(y, x)$, they would be receding in $(x, y)$, and thus again
approaching in $(-x, -y)$. Hence $(-x, -y)$ is an initial constellation
that will lead to some collision which we may designate by $\binom{-x,\,-y}{-u,\,-v}$.
By the principle of dynamical reversibility, however, the reverse of this
collision $\binom{u,\,v}{x,\,y}$ would also be a possible one. This ends, moreover, with
the constellation of interest $(x, y)$, thus giving the desired demonstration
that this is a possible final constellation for an actually possible
collision.

As a consequence of the above finding we now see that by starting 
out with the totality of all possible initial constellations with which 
a collision could begin and constructing the constellations that corre- 
spond thereto we should obtain the totality of all final constellations, 
and, vice versa, starting with the totality of all final constellations the
corresponding constellations would give the totality of all initial ones.
It will also be noted that similar relations would hold if we started 
with the totality of all initial constellations or of all final constellations 
and constructed the reverse constellations thereto. 


41. The closed cycle of corresponding collisions 

We must now consider an important property of corresponding collisions
which was first discovered by Boltzmann. The result which he
obtained may be called the theorem of the closed cycle of corresponding
collisions, and this theorem proves to be fundamental for the derivation
of Boltzmann’s H-theorem in a general enough form to be valid for
molecules of any degree of complexity.

Let us write down a series of conceptually possible collisions

\[
\binom{2,\,1}{3,\,4},\ \binom{4,\,3}{5,\,6},\ \binom{6,\,5}{7,\,8},\ \binom{8,\,7}{9,\,10},\ \ldots \tag{41.1}
\]

which is constructed by starting with some arbitrary collision $\binom{2,\,1}{3,\,4}$,
taking next the collision that corresponds thereto, and so on in
succession setting down each time the collision that corresponds to the
preceding one in the list. We now wish to prove that such a list would
have the nature of a closed cycle which would ultimately start to
repeat itself.

To show this, let us regard the initial constellation of all $(2, 1)$, with
which we commence the series, as specified in the manner already
described in §39, by giving the infinitesimal ranges $\delta\omega_1$ and $\delta\omega_2$ within
which we take the initial values of the variables $q_4 \ldots q_r,\ p_x,\ p_y,\ p_z,\ p_4 \ldots p_r$
that specify the internal condition and external momenta for
the two molecules, and by giving the infinitesimal solid angle $\delta\lambda$, or the
corresponding infinitesimal area $b^2\,\delta\lambda$, within which the centre of gravity
of the second of the two molecules lies at the start of the collision with
respect to the centre of gravity of the first molecule as an origin.
With such a specification of the initial constellation it will then be
noted that the final constellation for the first collision would itself
be specified only to a similar set of infinitesimal ranges $\delta\omega_3$, $\delta\omega_4$, and
$\delta\lambda$. So also for the initial and final constellations for all the following
collisions that we have written down, the specifications will only be to
within a set of infinitesimal ranges.

In view of the foregoing, it will now be appreciated, with any choice
of size for our original small ranges $\delta\omega_1$, $\delta\omega_2$, and $\delta\lambda$, that we shall



§41 


CYCLES OF COLLISIONS 


116 


ultimately have to come to a collision in the series with a final con- 
stellation that corresponds to the initial constellation for some earlier 
collision that has already appeared in the list, since, with a fixed value 
for the energy of the two molecules and with small but nevertheless 
non-vanishing sizes for ranges of the kind $\delta\omega$ and $\delta\lambda$, an unlimited
number of completely separate possibilities for the final constellations
is not available.† We now wish to show that the first time when this
occurs we shall have the final constellation (1, 2) which corresponds to 
the first initial constellation of all (2, 1) and thus gives us a closed cycle. 

To prove this, let us write down our series of collisions (41.1) in the
somewhat more general form

\[
\binom{2,\,1}{3,\,4},\ \binom{4,\,3}{5,\,6},\ \ldots,\ \binom{k,\,k-1}{k+1,\,k+2},\ \binom{k+2,\,k+1}{k+3,\,k+4},\ \binom{k+4,\,k+3}{k+5,\,k+6},\ \ldots \tag{41.2}
\]

and let us consider any collision in the series that does have a final
constellation which corresponds to some initial constellation that has
already appeared earlier in the series. For example, let us assume that
the final constellation $(k+5, k+6)$ corresponds to the earlier initial
constellation $(6, 5)$. Making a formal use of the equality sign to denote
the identity of constellations or of collisions that are expressed in
different form but are really the same, we can then write

\[
(k+5,\,k+6) = (5,\,6),
\]

and hence

\[
\binom{k+4,\,k+3}{k+5,\,k+6} = \binom{k+4,\,k+3}{5,\,6}.
\]

But we already have an expression $\binom{4,\,3}{5,\,6}$ for the collision which ends
with the constellation $(5, 6)$, and hence now obtain

\[
(k+4,\,k+3) = (4,\,3) \quad\text{or}\quad (k+3,\,k+4) = (3,\,4),
\]

which gives us for the next to last collision written in the list (41.2)

\[
\binom{k+2,\,k+1}{k+3,\,k+4} = \binom{k+2,\,k+1}{3,\,4}.
\]

Repeating a similar argument, based on the consideration that we
already have the expression $\binom{2,\,1}{3,\,4}$ for a collision ending with the
constellation $(3, 4)$, we then obtain

\[
(k+2,\,k+1) = (2,\,1) \quad\text{or}\quad (k+1,\,k+2) = (1,\,2),
\]

† See the end of the present section for a discussion of the degree to which this final
constellation could be regarded as corresponding to the earlier initial constellation. 



which finally gives us the desired relation

\[
\binom{k,\,k-1}{k+1,\,k+2} = \binom{k,\,k-1}{1,\,2}. \tag{41.3}
\]


This result then shows, starting with any particular place of occurrence
in the series (41.2) of a final constellation which corresponds to some
previous initial constellation, that we could always trace back to the
first occurrence of such a place, where the series would be closed by
the appearance of a final constellation which corresponds to the initial
constellation for the first collision of all with which the list was commenced.

This is the desired theorem of the closed cycle of corresponding
collisions. The theorem will be needed in the next chapter in deriving
Boltzmann’s H-theorem.
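The logic of the proof can be illustrated with a toy model in Python. The sketch assumes, purely for illustration, that the initial constellations have been divided into a finite number of cells labelled by integers, and that `step` maps each cell to the initial constellation of the corresponding collision. The function names and the random map are hypothetical and are not part of the text. Because a collision ending in a given final constellation is unique, `step` is one-to-one, and the first repeated cell is then necessarily the starting one.

```python
import random


def corresponding_cycle(step, start):
    """Follow initial constellations under `step` until one repeats.

    Returns the list of cells visited and the first cell that recurs.
    """
    visited = [start]
    current = step[start]
    while current not in visited:
        visited.append(current)
        current = step[current]
    return visited, current


def random_one_to_one_map(n, seed):
    """A random permutation of n cells, standing in for the mechanics."""
    rng = random.Random(seed)
    images = list(range(n))
    rng.shuffle(images)
    return dict(enumerate(images))


if __name__ == "__main__":
    step = random_one_to_one_map(50, seed=1)
    visited, repeat = corresponding_cycle(step, start=0)
    print(len(visited), repeat)  # the repeated cell is the starting cell
```

If the map were not one-to-one, the series could fall into a loop that excludes the first member. For example, `corresponding_cycle({0: 1, 1: 2, 2: 1}, 0)` first repeats at cell 1 and never returns to cell 0, which is why the uniqueness supplied by dynamical reversibility is essential to the theorem.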

There is a point in connexion with the above deduction of the closed
cycle of corresponding collisions that needs further elaboration. We
have based the deduction on the idea that there must at least be some
place in the series of collisions (41.2) where we encounter a final constellation
which corresponds to an initial constellation that has already
appeared in the list, and have justified this idea by the consideration
that an unlimited number of completely separate possibilities for final
constellations would not be available, since they are specified by ranges
of the kind $\delta\omega$ and $\delta\lambda$ which, though small, would nevertheless ultimately
exhaust the total range of possibilities. Two remarks, however,
must now be made in this connexion.

In the first place, we shall actually wish to regard the ranges $\delta\omega_1$,
$\delta\omega_2$, and $\delta\lambda$, needed for starting off the list of collisions (41.2), as
differential quantities that can be made to approach the limit zero
as we go to treatments of higher and higher accuracy. As we make
these ranges smaller and smaller, however, it is then evident that we
may encounter cases where we have to take a longer and longer list of
collisions before we get to a final constellation with ranges that overlap
those of some earlier initial constellation. Hence we shall have to allow
the possibility that the list may become infinitely long before the
desired closure is established.

In the second place, it is evident, as we look along the list of collisions 
given by (41.2) for a final constellation that does correspond to some 
earlier initial constellation, that we may expect in general to find a final 
constellation that corresponds only incompletely with an earlier initial 
one by a partial overlap of ranges. Thus, returning to the line of proof
given in the foregoing, it might be that only a part of the precisely
defined constellations in the range denoted by $(k+5, k+6)$ would correspond
to constellations in the range denoted by $(6, 5)$. Hence, when
we trace back to the constellation $(k+1, k+2)$, we could only conclude
that there was partial correspondence with the original constellation
$(2, 1)$. Nevertheless, if this degree of correspondence is not sufficiently
exact to satisfy us, it is evident that we can always continue to add
members to the list of collisions given by (41.2) until we ultimately get
a final constellation which corresponds just as exactly as we please with
an earlier initial constellation.† Thus, if we again recognize the possibility
that closure may only result when the list becomes infinitely
long, our theorem of the closed cycle of corresponding collisions still
remains valid.

We thus find, for two different reasons, that the length, which we 
have to assign to a list of corresponding collisions before we get closure, 
may depend at least in some cases on the degree of accuracy which we 
demand of our treatment. This complicating circumstance does not 
appear, however, to invalidate the proof of Boltzmann’s H-theorem 
which we shall give in the next chapter. 

42. The closed cycle of two members in the case of spherical molecules

In the case of molecules having states that can be defined by specifying
the external motion of the centre of gravity together with an
internal condition which is static and exhibits spherical symmetry
around the centre of gravity,‡ the foregoing theorem reduces to a very
simple form, since it can be shown that the whole closed cycle of corresponding
collisions reduces to a series of only two members. The simplicity
of the result depends on the circumstance that the various
possible states $i$, $j$, $k$, etc., for a molecule can then be taken as depending
only on the magnitudes and directions of the velocities $v_i$, $v_j$, $v_k$, etc.,
for the centre of gravity of the molecule in those states, and on an
internal condition which will not be affected by reversal of velocities
or by reflection parallel to any plane.

† Our later application of Liouville’s theorem to collisions will show that there is
nothing in the nature of a progressive change in the size of the ranges of the kind $\delta\omega$
and $\delta\lambda$, which we use in describing constellations, that would still further prolong the
arrival at a point in the list where the desired degree of precision for the correspondence
is attained.

‡ We thus exclude from our present consideration cases where the state of the free
molecule would also depend on such factors as rotation or vibration.





To prove the desired result, let us start with any desired arbitrary
collision between the two molecules

\[
\binom{j,\,i}{k,\,l} \tag{42.1}
\]

and then construct the reverse collision thereto

\[
\binom{-k,\,-l}{-j,\,-i}. \tag{42.2}
\]

This will necessarily have to exist in accordance with the principle of
dynamical reversibility.

Let us next construct a congruent collision to (42.2)

\[
\binom{-k',\,-l'}{-j',\,-i'} \tag{42.3}
\]

by considering a rotation through the angle $\pi$ parallel to any plane P.
This will be a conceptually realizable collision since all congruent collisions
are equally possible.

It will now be appreciated, however, that the reversal of velocities
followed by the process of rotation through the angle $\pi$ has changed
the original vectors determining the states of the molecules in (42.1)
into their mirror images with respect to the plane P. And it will also
be appreciated that the rotation through the angle $\pi$ has changed the
positions of the two molecules into the mirror images, with respect to
the plane P, of the positions that would have been brought about by
interchanging the original positions of the molecules in the constellations
involved in (42.1). Hence the collision (42.3) will be seen to have
a character which can be represented by



(42.4) 


where the initial and final constellations are seen to be mirror images
with respect to the plane P of the constellations $(l, k)$ and $(i, j)$ that
would correspond to the original constellations appearing in (42.1).

Let us now construct the enantiomorphous collision to (42.4)

\[
\binom{l,\,k}{i,\,j}, \tag{42.5}
\]

which can be obtained by reflection with respect to the plane P. This
will be a possible collision in accordance with the principle of dynamical
reflectability. It will now be seen, in accordance with the foregoing
discussion, that this final collision (42.5) is not only the corresponding
one to the original arbitrary collision (42.1) with which we started, but
in addition that it ends with a final constellation which corresponds to
the initial constellation of (42.1). Hence, in the case of molecules having 
such spherical symmetry, we find that the closed cycle of corresponding 
collisions would always reduce to two members, which could be written 
in the form 

\[
\binom{j,\,i}{k,\,l},\ \binom{l,\,k}{i,\,j}. \tag{42.6}
\]


It is of interest to note in the present case that the final constellation 
$(i, j)$ would correspond exactly to the initial constellation $(j, i)$, without
any necessity for considering a partial overlap of ranges of specification 
which might occur in some cases as discussed at the end of the preced- 
ing section. It is also of interest to note for this simple case that each 
of the above collisions would be the inverse collision to the other, as 
already defined in §40, since each starts with the two molecules in the 
states which they would occupy at the end of the other collision, and 
returns them to the states with which that collision would start. 
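The two-member cycle can be checked numerically for the simplest spherical model. The sketch below assumes smooth, rigid spheres that collide instantaneously, which is a narrower model than the text requires, and the function names are hypothetical. The vector `n` is the unit vector along the line of centres from the first molecule to the second at the moment of contact. The corresponding collision is obtained by interchanging the centres of gravity, which replaces `n` by its negative while leaving the velocities unchanged.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def collide(v1, v2, n, m1=1.0, m2=1.0):
    """Velocities after a smooth hard-sphere collision.

    n is the unit vector from the centre of molecule 1 to that of
    molecule 2; the molecules must be approaching along n.
    """
    approach = dot([a - b for a, b in zip(v1, v2)], n)
    if approach <= 0.0:
        raise ValueError("molecules are not approaching along n")
    total = m1 + m2
    w1 = tuple(a - 2.0 * m2 / total * approach * c for a, c in zip(v1, n))
    w2 = tuple(b + 2.0 * m1 / total * approach * c for b, c in zip(v2, n))
    return w1, w2


def corresponding_collision(w1, w2, n, m1=1.0, m2=1.0):
    """Collision starting from the final constellation with centres interchanged."""
    return collide(w1, w2, tuple(-c for c in n), m1, m2)


if __name__ == "__main__":
    n = (0.8, 0.6, 0.0)
    first, second = (1.0, 0.0, 0.0), (-0.5, 0.2, 0.0)
    after = collide(first, second, n, m1=1.0, m2=3.0)
    print(corresponding_collision(*after, n, m1=1.0, m2=3.0))
```

Up to rounding, the second call returns the original velocities, so the corresponding collision is here also the inverse collision, in agreement with (42.6).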

In contrast to this simple result in the case of molecules having 
spherical symmetry, it may now be emphasized in more general cases 
that we cannot expect to find any inverse collision at all for an arbi- 
trarily chosen collision. In these more general cases the above simple 
proof for the existence of the inverse collision fails, since the first step 
of reversing velocities does not lead in general for a non-spherical 
molecule to a state congruent with the original one. And hence the 
following steps of rotation in the plane P through the angle $\pi$ and
reflection through that plane do not lead to constellations that corre- 
spond to those of the original collision. 



[Fig. 1. Collision between a sphere and a wedge-shaped body. Surviving panel label: "Before collision."]



It will be of interest to illustrate the general case of the failure for 
inverse collisions to exist, with the help of a very simple if somewhat 
artificial example. For this purpose let us consider a collision between 
a sphere and a wedge-shaped body as shown in Fig. 1. We take the 
masses of the two bodies as equal, and the point of collisional contact
as lying on the line connecting the centres of gravity of the two bodies,
in order to have the specially simple situations before and after collision
shown in the figure. After such a collision it is then evident that no
possible inverse collision could be constructed which would bring the
wedge to rest without rotational motion and return the sphere to its
original motion from left to right. Similar results are evidently to be
expected whenever the colliding bodies do not have spherical symmetry,
and a somewhat complete treatment for the case of ellipsoidal bodies
has been given by Lorentz.†

The specially simple possibility in the case of spherically symmetrical
molecules, of immediately correlating with any arbitrary collision an
inverse collision that would return the molecules to their original states,
provides the reason why the derivations usually presented for Boltzmann’s
H-theorem confine themselves to the treatment of collisions
between rigid, spherical molecules. It appears to have been clearly
appreciated by the original discoverer of the theorem, however, that
this is an artificially simplified case and that it is necessary to consider
a complete cycle of corresponding collisions in order to obtain a general
treatment.


43. Application of conservation laws to collisions 


We are now ready to commence the application of the laws of 
mechanics to collisions. In the present brief section we shall merely 
point out that the ordinary conservation laws of mechanics for energy 
and for the components of linear and angular momentum would, of 
course, hold for any collision. 


If we consider a collision $\binom{j,\,i}{k,\,l}$ the principle of the conservation of
energy can be expressed in the form

\[
\epsilon_i + \epsilon_j = \epsilon_k + \epsilon_l \tag{43.1}
\]

where the symbols $\epsilon_i$, $\epsilon_j$, $\epsilon_k$, $\epsilon_l$ denote the energies of the two molecules
in their initial and final states, no expressions for mutual energy of
interaction being needed owing to our assumption as to the extent
of separation of the two molecules by the critical distance $b$ at the
times which we take as marking the beginning and ending of a collision.

Similarly, as an expression of the principle of the conservation of
linear momentum for this collision, we can write

\[
\begin{aligned}
p_{ix} + p_{jx} &= p_{kx} + p_{lx},\\
p_{iy} + p_{jy} &= p_{ky} + p_{ly},\\
p_{iz} + p_{jz} &= p_{kz} + p_{lz},
\end{aligned} \tag{43.2}
\]

† Lorentz, Sitzungsber. d. Akad. d. Wiss. in Wien, 2 Abt., 95, 115 (1887).



§43 


CONSERVATION LAWS 


121 


where the symbols denote the initial and final components of momentum, 
parallel to Cartesian axes, calculated for the molecules as a whole from 
their masses and the velocities of their centres of gravity. If of interest we 
could also write expressions for the conservation of angular momentum. 
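As a worked illustration of (43.1) and (43.2), the sketch below treats the simplest case that can be solved in closed form: two structureless particles moving along a single line, so that the energies are purely kinetic and only one momentum component is involved. These restrictions, and the function names, are assumptions made for the example and are not part of the text. In this case the two conservation laws alone fix the final velocities.

```python
def elastic_1d(m1, u1, m2, u2):
    """Final velocities of two point masses after an elastic head-on collision."""
    total = m1 + m2
    v1 = ((m1 - m2) * u1 + 2.0 * m2 * u2) / total
    v2 = ((m2 - m1) * u2 + 2.0 * m1 * u1) / total
    return v1, v2


def momentum(m1, v1, m2, v2):
    return m1 * v1 + m2 * v2


def kinetic_energy(m1, v1, m2, v2):
    return 0.5 * m1 * v1 ** 2 + 0.5 * m2 * v2 ** 2


if __name__ == "__main__":
    m1, u1, m2, u2 = 2.0, 3.0, 1.0, -1.0
    v1, v2 = elastic_1d(m1, u1, m2, u2)
    # Both pairs of numbers should agree up to rounding.
    print(momentum(m1, u1, m2, u2), momentum(m1, v1, m2, v2))
    print(kinetic_energy(m1, u1, m2, u2), kinetic_energy(m1, v1, m2, v2))
```

For molecules with internal degrees of freedom the conservation laws no longer determine the outcome by themselves, since energy can pass between translational and internal motion; they remain necessary conditions on any collision.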


44. Application of Liouville’s theorem to a collision 


We now come to a more complicated application of the classical 
mechanics to collisions which will be very important for our further
work. Since a pair of colliding molecules is itself a mechanical system, 
obeying the laws of mechanics in the Hamiltonian form, it is evident 
that Liouville’s theorem, which was derived from those laws in Chap- 
ter III, should be applicable to collisional processes. We shall now 
make such an application, taking Liouville’s theorem in the form of 
the principle of the conservation of extension in phase. 


Let us consider any collision between two molecules $\binom{j,\,i}{k,\,l}$ which
could be specified by giving the ranges $\delta\omega_i$ and $\delta\omega_j$ for the states with
which the first and second molecules enter the collision, and by giving
the range $\delta\lambda_1$ for the direction of the line connecting the centres of
gravity of the two molecules (i.e. the angle of attack) when the collision
is commencing at time $t_1$. This will give us a range of neighbouring
processes to which the principle of the conservation of extension in
phase can be applied. It will first be desirable, however, to have an
exact description of one of these collisional processes.

For the purposes of exact description, a fairly complicated notation
will be necessary. We shall introduce single and double accents, ' and '',
to denote the values of variables which apply to the first and second
molecules respectively at the start of the collision, when they are in the
states partially specified by $i$ and $j$, and introduce triple and quadruple
accents, ''' and '''', to denote similar values for the two molecules at the
end of the collision when they are in states $k$ and $l$. In addition we
shall regard the collision under consideration as starting at time $t_1$ when
the centres of gravity of the two molecules first reach the critical distance
$b$, and as ending at time $t_2$ when they again reach that distance, and
shall use the subscripts 1 and 2 in general as denoting the values of variables
that apply respectively at the beginning and end of the collision.

With the help of this notation we may now describe a precise collision,
of the group $\binom{j,\,i}{k,\,l}$ that interests us, by taking

\[
x',\ y',\ z',\ q_4' \ldots q_r',\ p_x',\ p_y',\ p_z',\ p_4' \ldots p_r' \tag{44.1}
\]

as the values for the coordinates and momenta of the first molecule at
the beginning of the collision, and by taking

\[
x'',\ y'',\ z'',\ q_4'' \ldots q_r'',\ p_x'',\ p_y'',\ p_z'',\ p_4'' \ldots p_r'' \tag{44.2}
\]

as the values for the coordinates and momenta of the second molecule
at the beginning of the collision. Furthermore, since the centres of
gravity of the two molecules will be separated by the distance $b$ at
time $t_1$ when the collision starts, we can write

\[
\begin{aligned}
x'' &= x' + b\sin\theta_1\cos\phi_1,\\
y'' &= y' + b\sin\theta_1\sin\phi_1,\\
z'' &= z' + b\cos\theta_1
\end{aligned} \tag{44.3}
\]

as a connexion between the positional coordinates for the two molecules,
where $\theta_1$ and $\phi_1$ are the polar and equatorial angles which locate
the centre of gravity of the second molecule, with respect to that of the
first as an origin, at time $t_1$.
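The relation (44.3) is the ordinary conversion from spherical polar to Cartesian coordinates, with the first centre of gravity as origin. A minimal Python sketch, with a hypothetical function name and arbitrary illustrative numbers, is:

```python
import math


def second_centre(first, b, theta, phi):
    """Position of the second centre of gravity according to (44.3).

    first  : (x', y', z') of the first centre of gravity
    b      : critical distance
    theta  : polar angle, phi : equatorial angle (radians)
    """
    x, y, z = first
    return (x + b * math.sin(theta) * math.cos(phi),
            y + b * math.sin(theta) * math.sin(phi),
            z + b * math.cos(theta))


if __name__ == "__main__":
    origin = (1.0, 2.0, 3.0)
    point = second_centre(origin, b=0.5, theta=0.7, phi=1.9)
    # The separation of the two centres equals b for any choice of angles.
    print(math.dist(origin, point))
```

The same function serves for (44.6) at the end of the collision when the angles $\theta_2$ and $\phi_2$ and the triple-accented coordinates are supplied.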

Similarly, at time $t_2$ when the collision ends, we can take

\[
x''',\ y''',\ z''',\ q_4''' \ldots q_r''',\ p_x''',\ p_y''',\ p_z''',\ p_4''' \ldots p_r''' \tag{44.4}
\]

as the values of the coordinates and momenta of the first molecule,

\[
x'''',\ y'''',\ z'''',\ q_4'''' \ldots q_r'''',\ p_x'''',\ p_y'''',\ p_z'''',\ p_4'''' \ldots p_r'''' \tag{44.5}
\]

as the values for the coordinates and momenta of the second molecule, and

\[
\begin{aligned}
x'''' &= x''' + b\sin\theta_2\cos\phi_2,\\
y'''' &= y''' + b\sin\theta_2\sin\phi_2,\\
z'''' &= z''' + b\cos\theta_2
\end{aligned} \tag{44.6}
\]

as a connecting relation between positional coordinates, where $\theta_2$ and
$\phi_2$ are now the polar and equatorial angles for the centre of gravity
of the second molecule with respect to that of the first at the later
time $t_2$.

We are now ready to investigate the application of the principle of
the conservation of extension in phase. To do this, instead of considering
merely the above precisely defined collisional process, let us
now consider a group of such processes of neighbouring character that
would correspond to the ranges $\delta\omega_i$, $\delta\omega_j$, and $\delta\lambda_1$ by which we specify
what we have called the collision $\binom{j,\,i}{k,\,l}$. By equating the extensions
in phase for such a group of processes at times $t_1$ and $t_2$, we can then
obtain the desired application of Liouville’s theorem. We must first
examine suitable ranges to take for the initial values of the coordinates


§44 


APPLICATION OP LIOUVILLE’S THEOREM 


123 


and momenta of the two molecules in order to obtain such a group of 
processes. 

Let us begin by considering the initial values of the variables for the
first of the two molecules, as given by (44.1). Since we shall not be
interested in the particular position where a process of the kind $\binom{j,\,i}{k,\,l}$
occurs, we may take the coordinates $x'$, $y'$, $z'$ that locate the centre of
gravity of the first molecule at time $t_1$ as having values anywhere within
an arbitrary volume

\[
V = \iiint dx'\,dy'\,dz'. \tag{44.7}
\]

We may then complete the specification of initial conditions for the
first molecule by taking the remaining variables as lying within an
infinitesimal range

\[
\delta\omega_i = \int \cdots \int dq_4' \ldots dp_r' \tag{44.8}
\]

of the kind we have adopted for use in the specification of collisions.

Let us now consider the initial values of the variables for the second
of the two molecules, as given by (44.2). Since we shall be interested
in processes of the kind $\binom{j,\,i}{k,\,l}$, we must first pick out a range of values
for the initial coordinates $x''$, $y''$, $z''$ that will place the centre of gravity
of the second molecule in a position to secure results of this character.
For this purpose, let us return to the relation between the positional
coordinates for the two molecules given by (44.3), and for each set of
values $x'$, $y'$, $z'$ that locate the centre of gravity of the first molecule,
choose a range for $x''$, $y''$, $z''$ such that the centre of gravity of the
second molecule would enter the critical sphere surrounding the first
molecule within a range of polar angles $\theta_1$ to $\theta_1 + \delta\theta_1$, within a range
of equatorial angles $\phi_1$ to $\phi_1 + \delta\phi_1$, and within an interval of time $t_1$ to
$t_1 + \delta t$. Using the symbol $\delta\lambda_1$ to denote the infinitesimal solid angle
$\sin\theta_1\,\delta\theta_1\,\delta\phi_1$, and the symbol $\dot r_1$ to denote the radial component of the
relative velocity of approach of the two molecules, we see for each point
$x'$, $y'$, $z'$ that the above procedure would then place $x''$, $y''$, $z''$ within an
infinitesimal volume, which, neglecting higher order differentials, has
the magnitude

\[
b^2\,\delta\lambda_1\,\dot r_1\,\delta t = \iiint dx''\,dy''\,dz''. \tag{44.9}
\]

The nature and location of this infinitesimal volume can be understood
with the help of the rough diagrammatic representation provided by
Fig. 2, where $x'$, $y'$, $z'$ and $x''$, $y''$, $z''$ give a pair of precise
initial positions for the centres of gravity of the two molecules, which
would lead to



COLLISIONS AS MECHANISM OF CHANGE WITH TIME, Chap. V


a collision of the exact kind that was discussed above, and the
cross-hatching illustrates the infinitesimal volume
$b^2\,\delta\lambda_1\,k_1\,\delta t_1$ within which the centre of gravity of the
second molecule would lie in order to obtain our group of closely similar
collisions.

[Fig. 2. Two panels: at time $t_1$; at time $t_2$.]

To complete the initial specification of conditions for the second
molecule, we may now take the remaining variables for that molecule as
lying within an infinitesimal range

$$\delta\omega_j = \int\cdots\int dq_4'' \cdots dp_r''. \tag{44.10}$$

With the help of the foregoing specifications of the ranges for the
initial values of the coordinates and momenta for our system of two
molecules we can now write, by combining (44.7), (44.8), (44.9), and
(44.10),

$$\Gamma_1 = \int\cdots\int dx'\,dy'\,dz'\,dq_4' \cdots dp_r'\;dx''\,dy''\,dz''\,dq_4'' \cdots dp_r''
= \delta v_1\,\delta\omega_i\,b^2\,\delta\lambda_1\,k_1\,\delta t_1\,\delta\omega_j \tag{44.11}$$

as an expression for the initial value of the extension in phase for this
system.





We must now turn to a consideration of the ranges of final values 
for the coordinates and momenta. 

Let us begin by considering the final values for the positional
coordinates $x'''$, $y'''$, $z'''$ for the first molecule. Since all the
collisions that we are considering are substantially identical in
character, the final range of values for these variables, neglecting
second-order differentials, will be the same as their initial range, thus
giving us

$$\delta v_2 = \iiint dx'''\,dy'''\,dz''' \tag{44.12}$$

with

$$\delta v_2 = \delta v_1 \tag{44.13}$$

as the volume within which the centre of gravity of the first molecule
would lie at the final time $t_2$ when collisions of the kind discussed
above would end. To complete the specification of conditions for the first
molecule at the end of the collision we may now take

$$\delta\omega_k = \int\cdots\int dq_4''' \cdots dp_r''' \tag{44.14}$$

as an expression for the infinitesimal range of values, for the other
variables describing that molecule, which actually would apply at time
$t_2$ to the processes that we are considering.

Let us now consider the final values for the positional coordinates
$x''''$, $y''''$, $z''''$ for the second molecule. To study the range in values
for these coordinates at time $t_2$ we note, for any given final position
$x'''$, $y'''$, $z'''$ of the first molecule, that a collision of the exact kind
previously discussed would put the centre of gravity of the second
molecule at a precise point $x''''$, $y''''$, $z''''$ located on the critical
sphere with $x'''$, $y'''$, $z'''$ as a centre, as shown in the second half of
Fig. 2. The centres of gravity of the second molecule in the case of any
one of the other slightly different collisions which we are considering
would then lie at time $t_2$ in the neighbourhood of this point, as
illustrated by the cross-hatching in the diagram. Let us now denote by
$b^2\,\delta\lambda_2$ the infinitesimal area, at the above point on the surface
of the sphere, through which the centres of gravity of the second
molecules would pass, denote by $k_2$ the radial component of the relative
velocity of separation of the two molecules at the end of the collisions
under consideration, and denote by $\delta t_2$ the time interval between the
passage of the earliest and latest second molecule through the point
$x''''$, $y''''$, $z''''$. It will then be seen from the diagram, neglecting
second-order differentials, that we can write

$$b^2\,\delta\lambda_2\,k_2\,\delta t_2 = \iiint dx''''\,dy''''\,dz'''' \tag{44.15}$$




as an expression for the infinitesimal volume within which the centre
of gravity of the second molecule would lie at time $t_2$ for each position
of the centre of gravity of the first molecule. Furthermore, since
$\delta t_1$ is the time interval between the passage of the earliest and
latest second molecule through the point $x''$, $y''$, $z''$, and $\delta t_2$ is
such an interval for passage through the point $x''''$, $y''''$, $z''''$, we can
also write within second-order terms

$$\delta t_2 = \delta t_1. \tag{44.16}$$

To complete the specification of conditions for the second molecule at
the end of the collision we may now take

$$\delta\omega_l = \int\cdots\int dq_4'''' \cdots dp_r'''' \tag{44.17}$$

as an expression for the infinitesimal range of values, for the other
variables describing the second molecule, that actually would apply at
time $t_2$ to the processes that we are considering.

We have now completed the specification of the ranges for the final 
values of the coordinates and momenta for our system of two molecules, 
and, by combining (44.12), (44.14), (44.15), and (44.17), can write 

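$$\Gamma_2 = \int\cdots\int dx'''\,dy'''\,dz'''\,dq_4''' \cdots dp_r'''\;dx''''\,dy''''\,dz''''\,dq_4'''' \cdots dp_r''''
= \delta v_2\,\delta\omega_k\,b^2\,\delta\lambda_2\,k_2\,\delta t_2\,\delta\omega_l \tag{44.18}$$

as an expression for the final value of the extension in phase for this
system.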

In accordance with Liouville's theorem, however, we may now equate
this final value of the extension in phase to the original value for
the extension given by (44.11). Doing so, cancelling $b^2$ from both sides
together with the equal factors given by (44.13) and (44.16), we then
obtain as the finally desired result

$$k_1\,\delta\lambda_1\,\delta\omega_i\,\delta\omega_j = k_2\,\delta\lambda_2\,\delta\omega_k\,\delta\omega_l. \tag{44.19}$$

This important consequence of the principle of the conservation of
extension in phase applies in general to any collision $\binom{i,j}{k,l}$
between two molecules, where $k_1$ and $k_2$ are the radial components of the
relative velocities of approach and of recession of the centres of gravity
of the two molecules at the beginning and at the end of the collision
respectively, and the quantities $\delta\lambda_1$, $\delta\omega_i$, $\delta\omega_j$ and
$\delta\lambda_2$, $\delta\omega_k$, $\delta\omega_l$ specify respectively the initial and
final constellations for the collision, in the manner described in § 39.





45. The probability coefficients for collisions 

Having considered the properties of the different kinds of collisions 
which can take place between a pair of molecules, we now turn to a 
consideration of the frequency with which such collisions could be 
expected to occur in an actual system. Thus far our treatment of colli- 
sions has not involved statistical considerations. To treat our present 
problem, however, we shall now have to introduce the ideas of statistical
mechanics, at least in a simple form. 

Let us consider a gas, taken for simplicity sufficiently dilute so that
we may confine our attention to bimolecular collisions, and let us
inquire into the number of collisions of the kind specified by the symbol
$\binom{i,j}{k,l}$ which would take place within some time interval of
interest $t$ to $t+\delta t$. If we were provided with an exact and complete
specification, at time $t$, of the coordinates and momenta for each of the
molecules composing the system, it is then evident, with the help of the
exact laws of the classical mechanics, that we could make in principle a
calculation of the precise number of collisions of the above kind that
would take place in the selected time interval $\delta t$. If, however, we are
actually furnished with only an incomplete specification of state, we
shall then instead have to make a calculation, with the help of the
principles of statistical mechanics, of the number of collisions of the
above kind that can be expected on the average for samples of gas in the
condition that is specified.

Let us take the condition of the gas at time $t$ as specified merely by
its total volume $v$, total energy $E$, and by the numbers of molecules
$n_i$ and $n_j$ having internal coordinates and external and internal
momenta $(q_4 \ldots p_r)$ lying in the ranges $\delta\omega_i$ and $\delta\omega_j$
involved in a collision of the kind $\binom{i,j}{k,l}$ that interests us. We
assume that no specification is given for the positions of any of the
molecules. Furthermore, for simplicity, we take the gas as highly dilute
and as enclosed in a fixed container located in a field-free region, in
order to avoid restrictions on the positions of the molecules that would
otherwise result from the necessity of allowing for their mutual energy of
interaction and their potential energy in the field in meeting the
requirement as to total energy.

To apply the principles of statistical mechanics, by the methods
already discussed in Chapter III, we must first take an appropriate
representative ensemble of systems, each a gas of the kind under
consideration, with their individual phase points so located in the
phase space as to agree with the above partial specification of state, but
otherwise distributed uniformly in accordance with the hypothesis of
equal a priori probabilities for equal regions in the phase space. We
can then take the average behaviour of the systems in such an ensemble
as giving a good estimate for the behaviour of the actual system of
interest; or, more particularly in the present case, we can take the
average number of collisions of the kind $\binom{i,j}{k,l}$ that would occur in
the systems of the ensemble in the time $\delta t$ as an estimate for the
number that would occur in the actual sample of gas of interest. If we
choose the most probable number of collisions of the kind
$\binom{i,j}{k,l}$ in the time $\delta t$ as the average of interest, the
application of this procedure can be made in a very simple manner.

We must begin by considering the character of the appropriate repre- 
sentative ensemble. In accordance with our methods, the phase points 
for the members of this ensemble are to be taken as uniformly dis- 
tributed in so far as this does agree with the partial specification of 
state provided. In the present case, since the positions of the molecules 
are completely unspecified, and also unrestricted except for their
location in the volume $v$, this means a uniform distribution, within the
limits allowed by $v$, with respect to those axes in the phase space which
correspond to the positional coordinates for the centres of gravity of
the various molecules. As a consequence of this uniformity we can then
conclude for samples selected at random from the ensemble that the
probability $P$ of finding the centre of gravity of a particular molecule
of interest inside a specified volume $\delta v$ will be given by the ratio

$$P = \frac{\delta v}{v} \tag{45.1}$$


which the specified volume bears to the total volume v of the container. 
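
As a rough numerical illustration of this ratio, the following sketch scatters points uniformly through a container and counts the fraction that fall inside a chosen sub-volume. The cubic shapes, the side lengths, and the name `fraction_inside` are assumptions made only for the example and are not part of the argument above.

```python
import random


def fraction_inside(v_side, dv_side, samples, seed=0):
    """Estimate the probability of finding a uniformly placed point inside
    a cubic sub-volume of side dv_side that sits in one corner of a cubic
    container of side v_side (both shapes are hypothetical)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        # Uniform distribution over the container, as assumed for the ensemble.
        x, y, z = (rng.uniform(0.0, v_side) for _ in range(3))
        if x < dv_side and y < dv_side and z < dv_side:
            hits += 1
    return hits / samples


v_side, dv_side = 1.0, 0.5
estimate = fraction_inside(v_side, dv_side, samples=200_000)
exact = dv_side**3 / v_side**3  # ratio of specified volume to total volume
print(estimate, exact)
```

With a large number of samples the estimated fraction should lie close to the ratio of the two volumes, whatever the position of the sub-volume inside the container.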

With the help of this simple and perhaps obvious consequence of the 
methods of statistical mechanics we can now determine for the systems 
of the ensemble the most probable number of pairs of molecules which 

are so located that a collision of the kind $\binom{i,j}{k,l}$ would be
initiated within the time $\delta t$. In order for such a collision to occur,
the critical sphere of radius $b$, surrounding a molecule with coordinates
and momenta $q_4 \ldots p_r$ lying in the range $\delta\omega_i$, must be entered
through a specified area $b^2\,\delta\lambda_1$ on the critical sphere by the
centre of gravity of a second molecule with variables lying in the range
$\delta\omega_j$. Thus, for any given molecule in the range $\delta\omega_i$, a
collision of the kind $\binom{i,j}{k,l}$ will be initiated in the time
$\delta t$, if the centre of gravity of a second molecule in the range
$\delta\omega_j$ is situated within a specified volume
$b^2\,\delta\lambda_1\,k_1\,\delta t$, where $k_1$ is the radial component of the
relative velocity of approach of the two centres of gravity. Hence,
making use of the above result (45.1) for the probability of finding the
centre of gravity of a given molecule in a specified volume, and making
use of obvious approximations for the case of high dilution, we can now
write

$$n_i\,n_j\,\frac{b^2 k_1\,\delta\lambda_1\,\delta t}{v} \tag{45.2}$$

for the most probable number of pairs of molecules which are so
situated as to lead to a collision of the above kind in time $\delta t$, where
$n_i$ and $n_j$ are the specified numbers of molecules in the conditions
denoted by $\delta\omega_i$ and $\delta\omega_j$. Or, dividing through by $\delta t$, we
obtain for the most probable number of collisions $\binom{i,j}{k,l}$ occurring
per unit of time†

$$Z_{\binom{i,j}{k,l}} = n_i\,n_j\,\frac{b^2 k_1\,\delta\lambda_1}{v}. \tag{45.3}$$

It will be noted that the number of such collisions will be dependent
on the numbers of molecules in states $i$ and $j$ but quite independent of
the states of other molecules.
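
The dependence of this rate on the specified quantities can be made concrete with a small sketch. The function name and all numerical values are hypothetical; the formula is simply the product of the two specified numbers of molecules with the ratio of the entry volume per unit time, $b^2 k_1\,\delta\lambda_1$, to the container volume $v$.

```python
def collision_rate(n_i, n_j, b, k1, dlambda1, v):
    """Most probable number of collisions per unit time for n_i molecules
    in one range and n_j in the other (dilute, homogeneous gas).

    b        : radius of the critical sphere
    k1       : radial component of the relative velocity of approach
    dlambda1 : solid angle through which the second molecule enters
    v        : volume of the container
    """
    return n_i * n_j * b**2 * k1 * dlambda1 / v


# Hypothetical values chosen so that the arithmetic is easy to follow.
rate = collision_rate(n_i=2, n_j=3, b=1, k1=4, dlambda1=0.5, v=6)
doubled = collision_rate(n_i=4, n_j=3, b=1, k1=4, dlambda1=0.5, v=6)
print(rate, doubled)
```

Doubling the number of molecules in either range doubles the rate, while the populations of all other ranges do not enter at all.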

In order to put this result into a more useful form it will now be
desirable to introduce expressions, for the numbers of molecules $n_i$ and
$n_j$ in the ranges $\delta\omega_i$ and $\delta\omega_j$, which show the explicit
dependence of such numbers on the size of the ranges selected, and their
explicit dependence on the time in cases where a steady state of
equilibrium has not been reached. For this purpose, let us take the
expression

$$\delta n = f(q_1 \ldots p_r, t)\,\delta q_1 \cdots \delta p_r \tag{45.4}$$

as providing a general method of specifying the number of molecules,
of any kind contained in the system, which have coordinates and
momenta $q_1 \ldots p_r$ lying in the indicated range $\delta q_1 \cdots \delta p_r$ at
time $t$. We take the function $f$ which specifies the distribution as
depending in general, at a selected initial time $t$, in any arbitrary way
desired on the coordinates and momenta $q_1 \ldots p_r$. For the special case
of a steady condition of equilibrium the function reduces to the constant
form $nCe^{-\epsilon/kT}$, independent of the time but depending on the energy
$\epsilon(q_1 \ldots p_r)$, and the general expression reduces to the
Maxwell-Boltzmann distribution law

$$\delta n = nCe^{-\epsilon/kT}\,\delta q_1 \cdots \delta p_r. \tag{45.5}$$

† For simplicity we shall regard the time necessary for the completion of
a collision as sufficiently short so that it will not be necessary for us
to introduce a distinction between the number of collisions 'initiated'
and 'occurring' per unit time.

In the cases now under consideration, where the positions of the
molecules are left unspecified, the general expression (45.4) can be
rewritten in the form

$$\delta n = f(q_4 \ldots p_r, t)\,\delta x\,\delta y\,\delta z\,\delta q_4 \cdots \delta p_r, \tag{45.6}$$

where $x$, $y$, and $z$ are the Cartesian coordinates for position.
Integrating over the total volume $v$, and over a selected small range

$$\delta\omega_i = \int\cdots\int dq_4 \cdots dp_r$$

for the momenta and internal coordinates of a molecule, we then obtain
expressions of the desired form

$$n_i = f_i\,v\,\delta\omega_i \quad\text{and}\quad n_j = f_j\,v\,\delta\omega_j \tag{45.7}$$

for the total numbers of molecules $n_i$ and $n_j$ in the ranges
$\delta\omega_i$ and $\delta\omega_j$ that interest us, where $f_i$ and $f_j$ are the
values of the function $f$ at the time $t$, and at the locations of the
regions $\delta\omega_i$ and $\delta\omega_j$.

Substituting (45.7) into our previous equation (45.3), we can now write

$$Z_{\binom{i,j}{k,l}} = v\,b^2 k_1\,\delta\lambda_1\,f_i f_j\,\delta\omega_i\,\delta\omega_j \tag{45.8}$$

as the finally desired expression for the rate of the collisions that we
have been considering. This expression applies in general in the case
of a dilute, homogeneous gas to a collision denoted by $\binom{i,j}{k,l}$. The
quantity $v$ is the volume of the container, $b$ is the critical radius
defining the beginning and end of the collision, $k_1$ is the radial
component of the relative velocity of approach of the centres of gravity
of the two molecules at the start of the collision, $\delta\lambda_1$ is the
solid angle with respect to the centre of gravity of the first molecule as
an origin within which the centre of gravity of the second molecule lies
at the start of the collision, and the quantities $v f_i\,\delta\omega_i$ and
$v f_j\,\delta\omega_j$ give the total numbers of molecules $n_i$ and $n_j$ which
have momenta and internal coordinates lying in the ranges $\delta\omega_i$ and
$\delta\omega_j$, these quantities together with $\delta\lambda_1$ being those that
specify the character of the collision. With this understanding of the
quantities involved, the expression then gives the most probable rate,
with which collisions of the kind $\binom{i,j}{k,l}$ will be initiated, in the
systems of an appropriate representative ensemble all of which have the
above numbers of molecules $n_i$ and $n_j$ in the indicated conditions.
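
The substitution just described can be checked numerically. In the sketch below the two function names and every numerical value are assumptions chosen for illustration: the first function uses the numbers of molecules directly, the second uses the values of the distribution function together with the sizes of the ranges, and the two should agree when $n = f\,v\,\delta\omega$.

```python
import math


def rate_from_numbers(n_i, n_j, b, k1, dlambda1, v):
    # Rate written in terms of the specified numbers of molecules.
    return n_i * n_j * b**2 * k1 * dlambda1 / v


def rate_from_distribution(f_i, f_j, domega_i, domega_j, b, k1, dlambda1, v):
    # Same rate written in terms of the distribution function f and the
    # ranges d-omega for momenta and internal coordinates.
    return v * b**2 * k1 * dlambda1 * f_i * f_j * domega_i * domega_j


# Hypothetical container, critical radius, and initial constellation.
v, b, k1, dlambda1 = 2.0, 0.1, 5.0, 0.01
f_i, f_j = 300.0, 450.0
domega_i, domega_j = 0.02, 0.03

# Numbers of molecules in the two ranges, n = f * v * d-omega.
n_i = f_i * v * domega_i
n_j = f_j * v * domega_j

rate_n = rate_from_numbers(n_i, n_j, b, k1, dlambda1, v)
rate_f = rate_from_distribution(f_i, f_j, domega_i, domega_j, b, k1, dlambda1, v)
print(rate_n, rate_f)
```

The factor $v$ moves from the denominator to the numerator because each of the two numbers of molecules carries one factor of $v$.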

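We shall now wish to compare this expression for the rate of collisions
of the kind $\binom{i,j}{k,l}$ with similar expressions for the rates of
collisions of the reverse kind $\binom{-k,-l}{-i,-j}$ and of the corresponding
kind $\binom{k,l}{m,n}$. Since expressions similar to (45.8) can evidently be
used for any collision, the appropriate expression for the rate of the
reverse collision will be of the form

$$Z_{\binom{-k,-l}{-i,-j}} = v\,b^2 k_r\,\delta\lambda_r\,f_{-k} f_{-l}\,\delta\omega_{-k}\,\delta\omega_{-l} \tag{45.9}$$

and for the corresponding collision of the form

$$Z_{\binom{k,l}{m,n}} = v\,b^2 k_c\,\delta\lambda_c\,f_k f_l\,\delta\omega_k\,\delta\omega_l, \tag{45.10}$$

where we use $k_r\,\delta\lambda_r$ and $k_c\,\delta\lambda_c$ to designate for these new
collisions the quantities analogous to the $k_1\,\delta\lambda_1$ appearing in
(45.8), and the rest of the notation will be obvious.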

In order to compare the above three expressions for the rates of these
related collisions, it will now be necessary to recall the previous
equation (44.19), derived with the help of Liouville's theorem, which gives
a connexion between quantities pertaining to the beginning and to the
end of any collision $\binom{i,j}{k,l}$. This connecting relation has the form

$$k_1\,\delta\lambda_1\,\delta\omega_i\,\delta\omega_j = k_2\,\delta\lambda_2\,\delta\omega_k\,\delta\omega_l, \tag{45.11}$$

where, as quantities pertaining to the end of the collision, $k_2$ is the
radial component of the relative velocity of separation of the centres
of gravity of the two molecules, $\delta\lambda_2$ is the infinitesimal solid
angle specifying the range of positions for the second molecule on the
critical sphere surrounding the first, and $\delta\omega_k$ and $\delta\omega_l$ are
the ranges that actually apply to the momenta and internal coordinates of
the molecules as a consequence of the kind of collision considered.

With the help of this relation the remainder of the argument is now
quite simple. Since the reverse collision $\binom{-k,-l}{-i,-j}$ can be obtained
by merely reversing all velocities at the end of the originally specified
collision $\binom{i,j}{k,l}$, we must evidently have the relations

$$k_r = k_2,\quad \delta\lambda_r = \delta\lambda_2,\quad \delta\omega_{-k} = \delta\omega_k,\quad\text{and}\quad \delta\omega_{-l} = \delta\omega_l; \tag{45.12}$$

and since the corresponding collision can be obtained by merely
interchanging the centres of gravity of the two molecules at the end of the
originally specified collision $\binom{i,j}{k,l}$, we must also have the
relations

$$k_c = k_2 \quad\text{and}\quad \delta\lambda_c = \delta\lambda_2. \tag{45.13}$$

Combining with the consequences of Liouville's theorem given by
(45.11), and examining our original expressions (45.8), (45.9), and
(45.10) for the rates of the three related collisions, we then see that
we can write in general for the case of a dilute homogeneous gas

$$Z_{\binom{i,j}{k,l}} = C f_i f_j,\qquad Z_{\binom{-k,-l}{-i,-j}} = C f_{-k} f_{-l},\qquad Z_{\binom{k,l}{m,n}} = C f_k f_l \tag{45.14}$$

as respective expressions for the most probable rates of a given collision
$\binom{i,j}{k,l}$, the reverse collision thereto $\binom{-k,-l}{-i,-j}$, and the
corresponding collision thereto $\binom{k,l}{m,n}$, where the probability
coefficient $C$ has the same value

$$C = v\,b^2 k_1\,\delta\lambda_1\,\delta\omega_i\,\delta\omega_j \tag{45.15}$$

for all three collisions, and the instantaneous values for the numbers of
molecules $n_i$, in different ranges of the type $\delta\omega_i$, used in
specifying the nature of the collisions, are given by expressions of the
form

$$n_i = v f_i\,\delta\omega_i. \tag{45.16}$$

This final conclusion as to the equal probability coefficients for these
related collisions will be very important in the treatment of Boltzmann's
H-theorem in the next chapter.
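
The equality of the probability coefficients can be illustrated with a small numerical sketch. All values below are hypothetical: the final range for the second molecule is not chosen freely but is fixed by the relation obtained from Liouville's theorem, after which the coefficient computed from the initial constellation and the coefficient computed from the final constellation (as used for the reverse and corresponding collisions) should coincide.

```python
import math


def probability_coefficient(v, b, k, dlambda, domega_a, domega_b):
    # Coefficient multiplying the product of the two values of f in the
    # expression for the rate of a collision.
    return v * b**2 * k * dlambda * domega_a * domega_b


v, b = 2.0, 0.1  # hypothetical container volume and critical radius

# Initial constellation of the original collision (hypothetical values).
k1, dlambda1, domega_i, domega_j = 3.0, 0.02, 0.5, 0.4

# Final constellation: three quantities chosen, the fourth fixed by
# k1*dlambda1*domega_i*domega_j = k2*dlambda2*domega_k*domega_l.
k2, dlambda2, domega_k = 2.0, 0.05, 0.3
domega_l = k1 * dlambda1 * domega_i * domega_j / (k2 * dlambda2 * domega_k)

C_original = probability_coefficient(v, b, k1, dlambda1, domega_i, domega_j)
# For the reverse and the corresponding collision the approach velocity and
# the solid angle equal k2 and dlambda2, and the ranges are those of k and l.
C_related = probability_coefficient(v, b, k2, dlambda2, domega_k, domega_l)
print(C_original, C_related)
```

If the final range were chosen without regard to the Liouville relation, the two coefficients would in general differ, which is why that relation is essential to the conclusion.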


46. Concluding remarks on molecular collisions 

This now concludes our somewhat elaborate and perhaps tiresome 
treatment of molecular collisions as a mechanism leading to changes 
that take place with time in the condition of actual physical-chemical 
systems of interest. The complicated character of the discussion was 
made necessary by our desire to give a general treatment applying to 





molecules of any kind, rather than to restrict ourselves to the specially 
simple case of collisions between rigid elastic spheres, as is so often done 
in developments that are given for the classical statistical mechanics. 

It is evident, however, that we cannot regard this general treatment
of bimolecular collisions in a dilute homogeneous gas as giving a
complete account of all the processes by which actual physical-chemical
systems change with time from one condition to another. In addition
to collisions between pairs of molecules there will be collisions of higher
order and interactions with radiation. Furthermore, although in the
absence of complete homogeneity we can often apply our expressions
for the rates of collisions to small regions of the gas where the numbers
of molecules in different conditions are specified, it is evident that
processes such as pressure equalization and diffusion must also be
considered, using known but supplementary methods, in order to obtain
a complete picture of the time behaviour of physical systems.
Nevertheless, bimolecular collisions provide a sufficiently typical and
important mechanism of change, so that their use in connexion with the
derivation of the H-theorem in the next chapter will be appropriate
for giving a fundamental understanding of the nature, if not a complete
picture, of the changes that take place with time in the condition
of actual physical-chemical systems.

There is, of course, also another serious limitation on the general
applicability of the methods of the present chapter in that we have 
based them on the classical rather than on the quantum mechanics. 
Particularly in the case of the internal coordinates and momenta of 
a molecule we must expect many cases where the classical mechanics 
gives a very poor approximation for the more strictly valid results of 
the quantum mechanics. This difficulty will be remedied when we come 
to the analogous quantum mechanical part of our development of 
statistical mechanics. 

In this connexion it is also of interest to note that the quantum 
mechanical treatment, which we shall give to the changes in systems 
with time in Chapter XI, will be much more complete in character than 
the considerations of the present chapter. We shall, to be sure, also 
there give a quantum mechanical treatment to collisions as a mechan- 
ism of change, but on account of the greater power and simplicity of 
quantum mechanical as compared with classical methods, it will be 
found possible to supplement this with somewhat formal but never- 
theless quite general treatments applicable to any kind of processes 
that take place with time in actual physical-chemical systems. 



VI 

BOLTZMANN'S H-THEOREM

47. Definition of the quantity H 

In Chapter IV we have studied the Maxwell-Boltzmann expression 
which describes the distribution of molecules in different states that 
can be expected to prevail under conditions of macroscopic equilibrium; 
and in Chapter V we have made a detailed study of collisions as an 
important mechanism which leads to the change of molecules from one 
state to another. In the present chapter we are now ready to study 
Boltzmann's famous H-theorem, demonstrating the actual tendency
for the molecules of a system to approach their equilibrium distribution 
when started off in any desired arbitrary manner. The derivation of 
this theorem and the appreciation of its significance may be regarded 
as among the greatest achievements of physical science. 

In this section, as a preliminary task, we must consider the definition
of a quantity, called $H$ by Boltzmann, which, as we shall see, can be
regarded as providing an appropriate measure of the extent to which
the condition of a system deviates from that corresponding to
equilibrium. The later proof of the H-theorem will then consist in showing
the actual tendency for this quantity to decrease with time to a
minimum, and thus for the system to approach its equilibrium condition.

To obtain the desired definition we must first have a general method
of specifying molecular distributions, suitable for use whether
equilibrium has been attained or not. For this purpose we may again
introduce our previous expression (45.4) for the number of molecules,

$$\delta n = f(q_1 \ldots p_r, t)\,\delta q_1 \cdots \delta p_r, \tag{47.1}$$

in any small region of the $\mu$-space denoted by $\delta q_1 \cdots \delta p_r$, where
the function $f$ depends on the coordinates and momenta $q_1 \ldots p_r$ for the
molecules under consideration, and on the time $t$, in whatever manner
is needed to describe the distribution. In the special case of
equilibrium, the function $f$ would then assume the form $nCe^{-\epsilon/kT}$, and
(47.1) would reduce to the Maxwell-Boltzmann distribution law,

$$\delta n = nCe^{-\epsilon/kT}\,\delta q_1 \cdots \delta p_r, \tag{47.2}$$

where the energy $\epsilon$ depends on the coordinates and momenta
$q_1 \ldots p_r$ but not on the time $t$.

To use such specifications of molecular condition for the definition
of the quantity $H$, let us now regard the total $\mu$-space, for each kind
of molecule present in the system, as divided up into a collection of
small finite regions all having equal magnitudes of extension

$$\delta v_\mu = \delta q_1 \cdots \delta p_r \tag{47.3}$$

in the $\mu$-space. In this way we are provided, as in § 27, with a collection
of equal 'cells' in the $\mu$-space, which we can think of as labelled by the
integers 1, 2, 3, ..., $i$, ... . In agreement with (47.1) we can then write

$$n_i = f_i(t)\,\delta q_1 \cdots \delta p_r \tag{47.4}$$

as a general expression for the number of molecules in the $i$th such
region at time $t$.

With the help of these quantities we can then define the quantity
$H$ for our system by

$$H = \sum_i f_i \log f_i\,\delta q_1 \cdots \delta p_r, \tag{47.5}$$

where we take a sum, for each kind of molecule present, over all the
regions $i$, of equal magnitudes $\delta q_1 \cdots \delta p_r$, into which we regard the
$\mu$-spaces for the different kinds of molecules as divided. Or, replacing
summation by integration over the whole of the $\mu$-spaces, we can also
write our defining expression in the form

$$H = \int\cdots\int f \log f\,dq_1 \cdots dp_r, \tag{47.6}$$

provided we realize that we must treat the differential ranges
$dq_1 \cdots dp_r$ as of high enough order so that no trouble arises from
regarding the number of molecules $f\,dq_1 \cdots dp_r$ in such ranges as a
continuous function of position in the $\mu$-space. Or, in accordance with
(47.3-5), it will be seen that we can also express $H$ in terms of the
numbers of molecules present in the different cells $i$ in the form

$$H = \sum_i \left(n_i \log n_i - n_i \log \delta v_\mu\right) = \sum_i n_i \log n_i + \text{const.}, \tag{47.7}$$

where we take a summation over all such numbers of molecules $n_i$.
This last form of expression is useful in showing, for any specified
condition of an isolated system, the close relation between our new
quantity $H$ and our earlier quantity $P$, giving the probability of finding
a system chosen at random from the corresponding microcanonical
ensemble in the specified condition. Comparing (47.7) with our previous
expression (28.4), we can write

$$H = -\log P + \text{const.}, \tag{47.8}$$

where the constant term only contains quantities of a kind that do not
depend on the different conditions of any given isolated system. In




accordance with this relation we now see that the quantity $H$ has been
defined in such a way as to give a measure of the extent to which the
condition of an isolated system deviates from equilibrium, since
decreases in this new quantity $H$ would correspond to increases in the
quantity $\log P$, which assumes its maximum possible value at
equilibrium. It is also to be noted, for a given isolated system, with
specified energy content, volume, and total number of molecules, that there
would be a minimum possible value of $H$ corresponding to our
previously investigated maximum possible value of $\log P$. Hence the
approach of a system towards the condition of equilibrium may be
regarded as corresponding to a decrease in $H$ towards its minimum
possible value.
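
The two cell forms of $H$ can be compared in a short sketch. The cell occupations, the cell size, and the function names are hypothetical. The toy comparison at the end fixes only the total number of molecules and imposes no condition on the energy, so it illustrates the direction in which $H$ changes rather than the true equilibrium of a gas.

```python
import math


def H_from_distribution(f_cells, cell_size):
    # Sum of f log f over equal cells, each of extension cell_size.
    return sum(f * math.log(f) * cell_size for f in f_cells if f > 0)


def H_from_numbers(n_cells, cell_size):
    # Sum of n log n over the cells, minus the constant n_total * log(cell_size).
    n_total = sum(n_cells)
    return sum(n * math.log(n) for n in n_cells if n > 0) - n_total * math.log(cell_size)


cell_size = 0.5                     # hypothetical extension of each cell
n_peaked = [10, 20, 30, 40]         # hypothetical occupations, total 100
f_peaked = [n / cell_size for n in n_peaked]

H_dist = H_from_distribution(f_peaked, cell_size)
H_num = H_from_numbers(n_peaked, cell_size)

# Same total number spread evenly over the same four cells.
H_uniform = H_from_numbers([25, 25, 25, 25], cell_size)
print(H_dist, H_num, H_uniform)
```

The two forms agree because each $f_i$ is the occupation $n_i$ divided by the common cell size, and the evenly spread occupation gives the smaller value of $H$ in this unconstrained toy case.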

One further form of expression for $H$ will also prove useful in a later
chapter (see § 102) when we wish to set up the quantum analogue of
the classical quantity $H$. In accordance with § 28, the probability $P$
for any condition of interest is to be taken proportional to the number
of regions of equal volume $\delta v_\gamma$ in the phase space for the system
which correspond to that condition. Hence, using $G$ to denote the number
of such regions, we can also write our expression for $H$ in the form

$$H = -\log G + \text{const.}, \tag{47.9}$$

where $G$ may be conveniently spoken of as the number of classical
states, defined by the equal volumes $\delta v_\gamma$, which correspond to the
condition of interest.


48. Derivation of the H-theorem


(a) Rate of change of H with time. We are now ready to consider
the verification of the H-theorem, which asserts an actual tendency for
$H$ to decrease with time, and hence for the molecules of a system to
approach their equilibrium distribution. To do this we shall first obtain
a general formal expression for the rate of change in $H$ with time, and
then show the effect of actual molecular processes in contributing to
a negative value for this rate of change.

Let us start with our definition for the quantity $H$ in the form

$$H = \int\cdots\int f \log f\,dq_1 \cdots dp_r, \tag{48.1}$$

where the function $f(q_1 \ldots p_r, t)$ determines at time $t$ the distribution
in the $\mu$-space for each kind of molecule present, in accordance with the
expression

$$\delta n = f(q_1 \ldots p_r, t)\,\delta q_1 \cdots \delta p_r. \tag{48.2}$$

If we now differentiate (48.1) with respect to $t$, we can write for the
rate of change in $H$ with time

$$\frac{dH}{dt} = \int\cdots\int \left(\frac{\partial f}{\partial t}\log f + \frac{\partial f}{\partial t}\right) dq_1 \cdots dp_r, \tag{48.3}$$

since $f$ is the only factor that depends on the time. Noting, however,
the significance of the second term $\partial f/\partial t$ that appears on the
right-hand side of (48.3), we appreciate in accordance with (48.2) that the
integration of this term over all values of the coordinates and momenta
would give an expression for the rate of change with time in the total
number of molecules in the system. And this can be taken equal to zero,
since we shall only consider systems composed of a fixed number of
molecules. Hence we can now write

$$\frac{dH}{dt} = \int\cdots\int \frac{\partial f}{\partial t}\log f\,dq_1 \cdots dp_r \tag{48.4}$$

as a general, formal expression for the rate of change of $H$ with time,
in terms of the function $f$, which specifies the distribution of molecules
among their own different states, and in terms of its time rate of
change $\partial f/\partial t$.
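
The vanishing of the second term can be checked on a discretized toy distribution. In the sketch below the cell values, their rates of change, and the cell size are all hypothetical; the rates are chosen to sum to zero so that the total number of molecules stays fixed, and the rate of change of $H$ found by finite differences is compared with the sum of $(\partial f/\partial t)\log f$ over the cells.

```python
import math


def H(f_cells, cell_size):
    # Discrete form of the integral of f log f over equal cells.
    return sum(f * math.log(f) for f in f_cells) * cell_size


cell_size = 0.5
f0 = [1.0, 2.0, 3.0, 4.0]       # hypothetical values of f at time t
dfdt = [0.3, -0.1, -0.4, 0.2]   # hypothetical rates of change, summing to zero


def f_at(dt):
    # Linear extrapolation of the distribution a short time dt away from t.
    return [f + dt * g for f, g in zip(f0, dfdt)]


h = 1e-6
# Central finite difference for dH/dt at time t.
numeric = (H(f_at(h), cell_size) - H(f_at(-h), cell_size)) / (2 * h)
# Formal expression: only the (df/dt) * log f term survives.
formula = sum(g * math.log(f) for f, g in zip(f0, dfdt)) * cell_size
print(numeric, formula)
```

If the rates did not sum to zero, the finite-difference value would differ from the formal expression by the rate of change of the total number of molecules.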

Starting with this formal expression for the time rate of change in
$H$, a general derivation of the H-theorem would now consist in showing,
once and for all, for every possible kind of molecular process, and for
any possible specification of the distribution function $f$, that we could
expect values of $\partial f/\partial t$ which would make $H$ decrease with time to its
minimum equilibrium value. With the present method of attack,
however, no such general derivation appears feasible, since each kind of
molecular process that can lead to changes in the distribution $f$ needs
what amounts to a separate investigation. Hence we now turn to a
special consideration of the effect of collisions as an important and
typical mechanism determining the change of $H$ with time, thereby
specializing to sufficiently dilute systems so that bimolecular collisions
can be considered as a well-defined source of change. We follow this
with some qualitative remarks as to other mechanisms. And we shall
later consider more general methods of attacking the approach of a
system to equilibrium, in § 51 of this chapter for the case of the
classical mechanics, and in §§ 105 and 106 of Chapter XII for the case
of the quantum mechanics.

(6) Effect of collisions on change of H with time. To study the effect 
of collisions in leading to changes in H with time, let us consider a dilute 
gas composed of one or more kinds of molecules, enclosed in a fixed 

3596.26 m 



138 


BOLTZMANN’S H-THEOBEM 


Chap, VI 


contSiiuor of volumo v in a region where no external fields of force are 
acting, having a total energy E, and having for each kind of molecule 
a distribution over molecular states which is specified at time t by an 
expression of the form 

8n = /(g'4 ... Pr> 0 8*82/82 8^4 ... (48.5) 

consonant with the specified value E of the total energy. We purposely 
take the distribution function f independent of the positional variables
x, y, z in order that we may concentrate our attention solely on the
effect of collisions, using the methods of Chapter V, and leave for later
mention other processes such as pressure equalization and interdiffusion
which could also contribute to changes in H in the case of a non-uniform
spatial distribution. In accordance with the circumstance that
the function f does not depend on the coordinates x, y, z for the positions
of the molecules, we may then rewrite (48.5) in the simplified form

$$\delta n = f(\omega, t)\,\delta v\,\delta\omega, \tag{48.6}$$

where $\delta v$ represents an infinitesimal range in position, and $\delta\omega$ represents
an infinitesimal range in the internal coordinates and external and
internal momenta $q_4 \dots p_r$. Using this simple specification of the
distribution, our general formula (48.4) for the rate of change in H with
time now assumes the form

$$\frac{dH}{dt} = \iint \frac{\partial f}{\partial t}\,\log f(\omega, t)\,dv\,d\omega = v \int \frac{\partial f}{\partial t}\,\log f(\omega, t)\,d\omega, \tag{48.7}$$

where the integration may be regarded, if necessary, as including terms 
for more than a single kind of molecule.

Returning to the methods of treating collisions developed in the preceding
chapter, we are now ready to consider the effect of any particular
collision, say one leading from states $i, j$ to states $k, l$, in contributing
to the above expression for the rate of change of H. In the first place,
in agreement with (45.7), we can write

$$n_i = v f_i\,\delta\omega_i, \quad n_j = v f_j\,\delta\omega_j, \quad n_k = v f_k\,\delta\omega_k, \quad n_l = v f_l\,\delta\omega_l \tag{48.8}$$

as expressions for the numbers of molecules in those ranges, $\delta\omega_i$, $\delta\omega_j$,
$\delta\omega_k$, and $\delta\omega_l$, from and to which such a collision leads, where $f_i$, $f_j$, $f_k$,
and $f_l$ are the values of $f(\omega, t)$ at the location of these ranges and at
the time of interest t. In the second place, in accordance with (45.14),
we can write

$$C f_i f_j \tag{48.9}$$

as an expression for the most probable number of such collisions per 
unit time, where the value of the coefficient C will be considered later.
Hence, since each such collision would result in the transfer of a pair 
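of molecules from the ranges $\delta\omega_i$ and $\delta\omega_j$ to the ranges $\delta\omega_k$ and $\delta\omega_l$,
we can now write

$$C f_i f_j = -v\,\frac{df_i}{dt}\,\delta\omega_i = -v\,\frac{df_j}{dt}\,\delta\omega_j = +v\,\frac{df_k}{dt}\,\delta\omega_k = +v\,\frac{df_l}{dt}\,\delta\omega_l \tag{48.10}$$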

as an expression for the probable rates of change in the quantities listed 
at the right. Comparing these expressions with the integral (48.7), 
which gives the total rate of change in H, we see that we can then take 


$$\frac{dH}{dt} = C f_i f_j\,(-\log f_i - \log f_j + \log f_k + \log f_l) \tag{48.11}$$

as an expression for the probable contribution to this total rate of
change that would be made by collisions of the kind considered.

This contribution to dH/dt could, of course, be either positive or
negative, according as $f_k f_l$ is greater or less than $f_i f_j$. Hence no
immediate conclusion as to the sign of dH/dt can be drawn until we see
the result of combining the effect of this particular collision with that
of others which would also be taking place in the gas. In this connexion
we may now give separate treatment, first to the specially simple case
of collisions between spherical molecules, and then to the more com- 
plicated case of collisions between molecules in general. 

In the case of spherical molecules, it has been shown in § 42 that the
existence of any collision implies the existence of another collision
which is both the corresponding and the inverse collision thereto.
Furthermore, it has been shown in § 46 that the probability coefficient
C would have the same value for the rates of occurrence of any collision
and the collision that corresponds thereto. Hence, in analogy to (48.11), 
we may now also write 

$$\frac{dH}{dt} = C f_k f_l\,(-\log f_k - \log f_l + \log f_i + \log f_j) \tag{48.12}$$

with the same coefficient C, as an expression for the probable contribution
to the total rate of change in H, which would be made by the
inverse collision to that first considered. Adding (48.11) and (48.12)
together, we then obtain as an expression for the combined effect of
the pair of inverse collisions

$$\frac{dH}{dt} = C\,(f_i f_j - f_k f_l)\,\log\frac{f_k f_l}{f_i f_j}. \tag{48.13}$$

This result, however, is of the form

$$C\,(x - y)\,\log\frac{y}{x},$$



where C, x, and y are all essentially positive quantities. Hence the 
total product itself can only be equal to or less than zero, since with 
x greater than y the factor log(y/x) would be negative, and with x less
than y the factor (x—y) would be negative. Thus, since every possible 
collision in the case of spherical molecules could be considered along 
with its inverse, we now obtain the final result 


$$\frac{dH}{dt} \le 0, \tag{48.14}$$

as a description of the combined effect of all collisions, the equality
sign applying in accordance with (48.13) only when we have a distribution
which makes

$$f_i f_j = f_k f_l \tag{48.15}$$

for all possible collisions.
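
The sign argument for a collision and its inverse can be checked numerically. What follows is only a minimal sketch and not part of the original treatment: it assumes four cells i, j, k, l of unit extension in a unit volume, a single coefficient C for both directions, and a simple forward-Euler time step. The names `dH_dt_pair`, `relax`, `dt` and `steps` are illustrative assumptions.

```python
import math


def dH_dt_pair(fi, fj, fk, fl, C=1.0):
    """Combined contribution C (x - y) log(y / x), with x = fi*fj, y = fk*fl."""
    x, y = fi * fj, fk * fl
    return C * (x - y) * math.log(y / x)


def relax(f, C=1.0, dt=1e-3, steps=5000):
    """Euler-integrate the four cell values under a collision and its inverse.

    Assumption: unit cell extensions and unit volume, so that
    H = sum of f log f over the four cells.
    Returns the final cell values and the list of H values before each step.
    """
    fi, fj, fk, fl = f
    history = []
    for _ in range(steps):
        history.append(sum(x * math.log(x) for x in (fi, fj, fk, fl)))
        net = C * (fi * fj - fk * fl)  # forward rate minus inverse rate
        fi -= net * dt
        fj -= net * dt
        fk += net * dt
        fl += net * dt
    return (fi, fj, fk, fl), history


if __name__ == "__main__":
    final, H = relax((2.0, 1.5, 0.5, 0.4))
    print("initial H:", H[0])
    print("final H:  ", H[-1])
    print("f_i f_j - f_k f_l at the end:", final[0] * final[1] - final[2] * final[3])
```

Under these assumptions H falls steadily and the run settles where the products of the cell values on the two sides of the collision agree, which is the equality case of the result above.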



In the more complicated case of collisions between non-spherical 
molecules the above treatment has to be modified, since we have seen 
in § 42 that an arbitrary collision will in general have no inverse collision 
at all. In this general case it then becomes necessary to consider the 
combined effect of the whole closed cycle of corresponding collisions 


$$(2,1 \to 4,3),\quad (4,3 \to 6,5),\quad (6,5 \to 8,7),\quad \dots,\quad (k,k-1 \to 2,1)$$

which, as shown in § 41, can always be set down as a closed list taking
any desired arbitrary collision as a starting-point. Making use of
the equality of values found in § 46 for the probability coefficients C
that appear in the expressions for the rates of corresponding collisions, 
and making use of the general form of expression (48.11) for the con- 
tribution of any particular collision to the probable rate of change in 
H, we can now take 

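$$\frac{dH}{dt} = C\left[f_1 f_2 \log\frac{f_3 f_4}{f_1 f_2} + f_3 f_4 \log\frac{f_5 f_6}{f_3 f_4} + \dots + f_{k-1} f_k \log\frac{f_1 f_2}{f_{k-1} f_k}\right] \tag{48.16}$$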

as an expression for the combined effect on the quantity H of all the
collisions that would appear in such a closed list. 

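This result is of the form

$$C\left[\alpha\log\frac{\beta}{\alpha} + \beta\log\frac{\gamma}{\beta} + \gamma\log\frac{\delta}{\gamma} + \delta\log\frac{\epsilon}{\delta} + \dots + \mu\log\frac{\alpha}{\mu}\right],$$

where the quantities $C, \alpha, \beta, \gamma, \delta, \epsilon, \dots, \mu$ are all essentially positive, and
can itself readily be shown to have a value that can only be equal to
or less than zero. To see this we begin by noting that somewhere in
the series of quantities $\alpha, \beta, \gamma, \delta, \epsilon, \dots, \mu$ there will have to be some
quantity which is not greater than either of its two neighbours. Let us
assume $\gamma$ to be such a quantity, and rewrite the above result in the form

$$C\,(\beta - \gamma)\log\frac{\gamma}{\delta} + C\left[\alpha\log\frac{\beta}{\alpha} + \beta\log\frac{\delta}{\beta} + \delta\log\frac{\epsilon}{\delta} + \dots + \mu\log\frac{\alpha}{\mu}\right].$$

Since we have picked out a quantity $\gamma$ such that

$$\gamma \le \beta, \qquad \gamma \le \delta,$$

we now see that we have replaced our original result by the sum of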
a term which cannot be greater than zero and a term of exactly the 
same form as before but containing one less factor in the argument of 
the logarithm. Proceeding in this fashion, it is then evident that we 
should finally obtain a sum of terms none of which could be greater 
than zero, and have thus obtained the desired proof that the total 
result can only be equal to or less than zero. 
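
The reduction just described can be followed step by step on a computer. The sketch below is an illustration under stated assumptions rather than the author's own procedure: the cycle is held as a Python list of positive numbers, the entry removed at each step is simply the smallest one (which is certainly not greater than either neighbour), and the names `cyclic_sum`, `peel` and `reduce_fully` are hypothetical.

```python
import math
import random


def cyclic_sum(a):
    """Sum of a[m] * log(a[m+1] / a[m]) taken round the closed cycle."""
    n = len(a)
    return sum(a[m] * math.log(a[(m + 1) % n] / a[m]) for m in range(n))


def peel(a):
    """Split off one term that cannot be positive and shorten the cycle.

    With prev, a[m], nxt playing the parts of beta, gamma, delta, the term
    split off is (beta - gamma) * log(gamma / delta).
    """
    n = len(a)
    m = min(range(n), key=lambda idx: a[idx])  # smallest entry qualifies
    prev, nxt = a[(m - 1) % n], a[(m + 1) % n]
    term = (prev - a[m]) * math.log(a[m] / nxt)
    return term, a[:m] + a[m + 1:]


def reduce_fully(a):
    """Repeat the peeling until a single entry is left; return all terms."""
    terms = []
    while len(a) > 1:
        term, a = peel(a)
        terms.append(term)
    return terms


if __name__ == "__main__":
    rng = random.Random(1)
    cycle = [rng.uniform(0.1, 5.0) for _ in range(8)]
    terms = reduce_fully(cycle)
    print("cyclic sum:       ", cyclic_sum(cycle))
    print("sum of peeled terms:", sum(terms))
    print("largest peeled term:", max(terms))
```

Each peeled term is a product of a non-negative factor and a non-positive logarithm, and the terms add up to the original cyclic sum, which is the content of the argument in the text.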

Hence, since any collision whatever could be treated as a member of 
a similar closed cycle of collisions, we can now conclude, as in the
special case of spherical molecules, that the combined effect of all
collisions would be to give

$$\frac{dH}{dt} \le 0 \tag{48.17}$$

as the most probable rate of change of H with time. It will also be 
seen from the foregoing that the equality sign would again apply only 
in the case of a distribution giving

$$f_i f_j = f_k f_l \tag{48.18}$$

for all possible collisions. This, moreover, is a relation, as we
shall see in detail in § 50 (b), that would characterize the condition of
the system when H has reached its minimum possible value and the
molecules of the system have attained their equilibrium Maxwell-Boltzmann
distribution. Furthermore, it will also be seen from the
foregoing that the negative values, predicted for dH/dt by (48.16),
would be greater and greater the larger the deviations from the equilibrium
relation (48.18). Hence we can say in a qualitative way that
the probability of finding a negative value of dH/dt becomes more and
more nearly certain for values of H farther and farther above the
minimum.

We thus complete the desired demonstration of the H-theorem for 
the general case of bimolecular collisions between any kinds of mole- 
cules, and may regard the results obtained as typical for other kinds 
of processes as well.†

(c) Effect of other processes on change of H with time. In addition 
to bimolecular collisions as a mechanism that leads to changes in the 
condition of physical-chemical systems, by changing the states of the 
pairs of molecules involved, it is evident that other processes can also 
contribute to changes in condition. Hence, as already remarked above, 
a complete verification of the H-theorem would necessitate an investiga- 
tion of the effect of every such process. In this connexion, nevertheless, 
we shall content ourselves with a few qualitative remarks as to other 
processes. We need feel no hesitation on the score of this incomplete- 
ness, however, since we shall later give more general methods of 
attacking the approach to equilibrium, both for the classical and for 
the quantum mechanics, as we have already remarked above. 


† In the original development of the H-theorem by Boltzmann, interest was primarily
centred on the effect of collisions in changing the internal conditions and external
momenta of molecules, and the treatment was thrown into such a form that H was
not regarded as changing as a result of molecular motions not involving collisions. See
Boltzmann, Vorlesungen über Gastheorie, Leipzig, 1910 and 1912, Part I, § 18, footnote 2.
This procedure of Boltzmann is not appropriate, however, if we take
the condition of a system as specified by the numbers of molecules in small but finite
ranges. See Ehrenfest, Encykl. d. Math. Wiss. IV. 2, ii, Heft 6.

Collisions of higher order than the second, where more than two 
molecules would simultaneously come close enough to influence each 
other, immediately suggest themselves as furnishing a kind of process 
which cannot be neglected. The treatment of such higher order colli- 
sions makes it necessary to introduce additional complications in 
defining critical constellations, in discussing the closed cycle of corresponding
collisions, in applying Liouville’s theorem, and in proving the
equality of coefficients for the probability of occurrence of correspond- 
ing collisions. Nevertheless, it appears possible to carry out deductions 
quite similar to that given above for bimolecular collisions. For 
example, in the case of trimolecular collisions we can expect an expres- 
sion of the form 

$$\frac{dH}{dt} = C\left[f_1 f_2 f_3 \log\frac{f_4 f_5 f_6}{f_1 f_2 f_3} + \dots + f_k f_{k+1} f_{k+2} \log\frac{f_1 f_2 f_3}{f_k f_{k+1} f_{k+2}}\right] \tag{48.19}$$

for the contribution to the probable value of dH/dt, which would now
be made by an appropriately selected closed cycle of corresponding
collisions. And this can be shown to be essentially a negative quantity,
with the value zero only when the distribution is such as to give
equalities of the form

$$f_i f_j f_k = f_l f_m f_n \tag{48.20}$$

for all possible trimolecular collisions. In this manner we can obtain
the desired verification of the H-theorem for collisional processes of
higher order than the second.

The two familiar processes, of pressure equalization for a gas as a
whole and of interdiffusion of its constituents in the case of a mixture,
also need mention. Starting with any non-uniform distribution of pressure
or concentration, the end result of these processes, neglecting for
simplicity the effect of external fields of force, is well known to be the
establishment of a uniformly distributed, uniform mixture of the constituents.
Two remarks may now be made in connexion with such
processes.

In the first place, it will be appreciated that the principles of statisti- 
cal mechanics would themselves indeed lead us to expect a general 
tendency for gases to distribute themselves uniformly. The application 
of these principles would consist in examining the average behaviour of 
a representative ensemble of systems so chosen that each member of the 
ensemble would exhibit any lack of uniformity of distribution specified 
for the actual gas of interest. With any non-uniform spatial distribution
it is evident, however, that highly artificial restrictions on precise 
molecular positions and velocities would be necessary to prevent a net 
transport of molecules from regions of high concentration to those of 
low. Hence, since no such special restrictions on positions and velocities 
would be warranted by any actual knowledge that we should have as 
to the gas of interest, substantially every system in the representative 
ensemble would exhibit a tendency to change in the direction of more 
uniform distribution, and this same tendency would then be predicted 
as probable for the sample of interest itself. 

As a second remark, it is to be emphasized that the occurrence of 
such spontaneous changes, in the direction of increased uniformity
of spatial distribution, would be accompanied by decreases with time 
in the quantity H. To show this let us return to our previous general 
expression (48.4) for the rate of change of H with time, 

$$\frac{dH}{dt} = \int \dots \int \frac{\partial f}{\partial t}\,\log f\,dx\,dy\,dz\,dq_4 \dots dp_r. \tag{48.21}$$

Examining this equation, we then see at least qualitatively that dH/dt
would be negative, when changes are taking place that make the
distribution function f depend in a more uniform manner on the
spatial variables x, y, z, since, by and large, the occurrence of such
changes would imply negative values of ∂f/∂t in spatial regions where
f is large and positive values where f is small. To take a specific
example, let us consider the transport of molecules between two equal
elementary cells in the μ-space, $\delta v_1\,\delta\omega$ and $\delta v_2\,\delta\omega$, which correspond to
equal but differently located ranges $\delta v_1$ and $\delta v_2$ for the positional
coordinates x, y, z of the molecules, and to the same range $\delta\omega$ for the
remaining variables $q_4 \dots p_r$ that determine the condition of the molecules.
Using $f_1$ and $f_2$ to denote the values of the distribution function
corresponding to these two cells, we could then write

$$\frac{dH}{dt} = \delta v_1\,\delta\omega\,\frac{df_1}{dt}\,\log f_1 + \delta v_2\,\delta\omega\,\frac{df_2}{dt}\,\log f_2 = \text{const.} \times \frac{df_1}{dt}\,\log\frac{f_1}{f_2} \tag{48.22}$$

as an expression for the contribution to the total rate of change in H,
which would correspond to the transport of molecules between the two
cells, where the second form of writing is made possible by the consideration
that any increase in $f_1$ would be at the expense of $f_2$. We
immediately see, however, that this expression would indeed be negative,
since with $f_1$ greater than $f_2$ the direction of spontaneous transport
would make $df_1/dt$ negative, and with $f_1$ less than $f_2$ the factor $\log(f_1/f_2)$
would itself be negative. Hence we can thus obtain also for the processes
of pressure equalization and interdiffusion the desired verification
of the H-theorem.
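
The two-cell example lends itself to a small numerical illustration. The sketch below is hedged accordingly: the text specifies only the direction of spontaneous transport, so the linear law (flow proportional to the difference of the two cell values) is an assumption made here for definiteness, as are the unit cell extensions and the names `two_cell_H`, `rate`, `dt` and `steps`.

```python
import math


def two_cell_H(f1, f2, rate=1.0, dt=1e-3, steps=5000):
    """Relax two equal cells toward uniformity and record H at each step.

    Assumptions (not from the text): unit cell extensions, so that
    H = f1 log f1 + f2 log f2, and a linear transport law in which the
    net flow out of cell 1 is rate * (f1 - f2).
    """
    H = []
    for _ in range(steps):
        H.append(f1 * math.log(f1) + f2 * math.log(f2))
        flow = rate * (f1 - f2) * dt  # molecules leaving cell 1 this step
        f1 -= flow
        f2 += flow
    return f1, f2, H


if __name__ == "__main__":
    f1, f2, H = two_cell_H(3.0, 1.0)
    print("final cell values:", f1, f2)
    print("H at start and end:", H[0], H[-1])
```

Whatever gain one cell makes is at the expense of the other, so the sum of the two cell values stays fixed while H falls toward the value belonging to the uniform distribution.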

There are, of course, many other important, spontaneous processes 
such as fusion, vaporization, chemical reaction, and the interactions 
between matter and radiation which might be considered in order to 
extend our verifications of the H-theorem. Further investigations along
present lines would hardly be justified, however, in view of the limited
range of validity for classical methods, and in view of the more general
treatment of the approach towards equilibrium which will be given 
later. 

In conclusion, it may be noted that the decrease of H to its final, 
lowest possible value would depend in general on the co-operation of 
more than a single process. Hence a complete treatment of the rate 
of change of H with time would involve a combination of the treat- 
ments given to the different individual kinds of process. Thus, for 
example, our treatment of the effect of bimolecular collisions in leading 
to a redistribution of molecular states in the case of a homogeneous 
gas can be readily extended to give an account of the effects of collisions 
in a non-homogeneous gas by regarding the gas as divided up into 
elementary volumes taken small enough to be treated as homogeneous, 
but this treatment then has to be further supplemented by including 
an account of those processes of pressure equalization and of inter- 
diffusion, which in the absence of external forces would finally lead to 
a uniform mixture for different kinds of molecules and of molecular
states. Similarly, the treatment of the effect of collisions in changing 
molecular states has to be supplemented by a treatment of the effects 
resulting from the absorption and emission of radiation. For this pur- 
pose it proves possible, guided by ideas from the quantum theory of 
radiation, to define a quantity H which depends not only in the pre- 
vious manner on the numbers of molecules in different states, but also 
contains further terms depending on the condition of the radiation 
field.† It then becomes possible to show that this combined H would
decrease to a minimum value in such a manner as to require the simultaneous
satisfaction of the Maxwell-Boltzmann distribution law for
molecules and of the Planck distribution law for radiation, thus pro- 
viding a combined derivation of these two laws. 

† See Tolman, Journ. of the Franklin Inst. 203, 814 (1927); and Statistical Mechanics,
New York, 1927, § 246, p. 198.


49. Discussion of the H-theorem 

(a) Statistical character of the H-theorem. We must now undertake
a somewhat lengthy and detailed discussion of the foregoing H-theorem
in order to have a clear appreciation of its significance. For this purpose 
it is especially important to recognize the character of Boltzmann’s 
result as a theorem of statistical mechanics which makes a reasonable 
prediction as to the future behaviour that may be expected for a system 
in an incompletely specified state, rather than as a theorem of exact 
mechanics which could furnish an exact prediction as to the precise 
behaviour of a system starting from a precisely specified state. The 
lack of appreciation of this distinction has often led in the past to 
seeming paradoxes and difficulties. 

As we have just suggested, the statistical character of the H-theorem
depends in the first place on the circumstance that we treat systems 
in states that are only incompletely specified, and in the second place 
on the consideration that we are thus forced to employ the methods of 
statistical rather than of exact mechanics. 

The incomplete nature of the specifications is inherent in the kind of 
descriptions which we use for the different conditions of a system that 
interest us. These descriptions are given by expressions (47.1) of the form

$$\delta n = f(q_1 \dots p_r, t)\,\delta q_1 \dots \delta p_r, \tag{49.1}$$

where $\delta n$ is the number of molecules with coordinates and momenta
in the range $\delta q_1 \dots \delta p_r$ and f is taken at the time of interest as a function
of the coordinates and momenta $q_1 \dots p_r$. Such an expression,
however, does not give an exact specification of the state of a system,
since we regard the various ranges $\delta q_1 \dots \delta p_r$ as having a small but
nevertheless finite extension, corresponding to an approximate observation
of the condition of the system which might actually be made in
practice. Indeed we treat the ranges $\delta q_1 \dots \delta p_r$ as infinitesimals of an
order large enough, so that $\delta n$ can in general be taken as a large number
and f as a continuous function of the variables $q_1 \dots p_r$. Hence no exact
specification of the positions, velocities, and internal conditions of the 
molecules is provided. 

In view of this incomplete specification of state we are then forced 
to employ the methods of statistical mechanics, when we come to com- 
pute the temporal behaviour of the quantity of interest 

$$H = \int \dots \int f \log f\,dq_1 \dots dp_r, \tag{49.2}$$

where it may be emphasized that this integral is itself to be regarded
as the limit of a sum of terms of the form $f \log f\,\delta q_1 \dots \delta p_r$ in which the
ranges $\delta q_1 \dots \delta p_r$ are treated as infinitesimals of the order mentioned
above. As a consequence, our final theorem as to the decrease of H 
with time has the statistical character of being a statement only of 
what can be expected on the average for a system in the condition that 
has been described. 

It will be of interest to follow the statistical character of the con- 
siderations in detail in the case of the treatment which we have given 
for the effect of collisions in leading to changes of H with time in a 
homogeneous gas. The condition of the gas at the time of interest t was 
there described by an expression (48.5) of the form 

$$\delta n = f(q_4 \dots p_r, t)\,\delta x\,\delta y\,\delta z\,\delta q_4 \dots \delta p_r, \tag{49.3}$$

where the function f was taken as independent of the positional coordinates
x, y, z. This description of condition is not sufficient for a
precise specification of state, since it does not prescribe precise values
for coordinates and momenta but only gives the numbers of molecules
which would have values of those quantities falling within infinitesimal
ranges of large enough order so that f can be treated as a continuous
function. In particular it is to be noted that the lack of dependence
of f on x, y, and z only means a macroscopic homogeneity of spatial
distribution which puts no restrictions on the number of pairs of mole- 
cules that would have mutual positions such as to initiate a collision 
of any kind of interest. In view of the lack of precise specification, we 
then have to employ the methods of statistical mechanics to predict 
the expected changes with time in the condition of the system. This 
we do by using our previous expressions for the average (most probable) 
rates of collision in a representative ensemble of systems, all of which 
are in states that agree with a description of condition of the kind 
(49.3), but are otherwise distributed in accordance with the hypothesis
of equal a priori probabilities for equal regions in the phase
space. Hence the final use of these expressions in calculating the
value of dH/dt will only give us a result which can be expected to
hold on the average under the circumstances so far as they have been
specified. 

As the result of the foregoing discussion, we now recognize in general 
the statistical character of the H-theorem. In accordance with this 
theorem, a system in a condition that does not correspond to the lowest 
attainable value of H will exhibit on the average a negative value of
dH/dt, but this does not have to be true in each individual case. The
recognition of this possibility for exceptional behaviour is important in
clarifying difficulties that have sometimes arisen in understanding the
H-theorem, as will be seen in what follows.
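
The difference between a decrease on the average and a decrease in every individual case can be made concrete with a toy model. The sketch below uses the two-urn model associated with the Ehrenfests, which is substituted here purely for illustration and is not the gas model treated in this chapter; the quantity `urn_H`, the parameters `N`, `steps` and `seed`, and the function names are all assumptions of the sketch.

```python
import math
import random


def urn_H(n1, n2):
    """H taken as the sum of n log n over the two urns (0 log 0 counted as 0)."""
    return sum(n * math.log(n) for n in (n1, n2) if n > 0)


def ehrenfest(N=1000, steps=5000, seed=0):
    """Move one randomly chosen ball per step and record H after each move."""
    rng = random.Random(seed)
    n1 = N  # start far from equilibrium: every ball in urn 1
    H = [urn_H(n1, N - n1)]
    for _ in range(steps):
        if rng.random() < n1 / N:  # the chosen ball was in urn 1
            n1 -= 1
        else:                      # the chosen ball was in urn 2
            n1 += 1
        H.append(urn_H(n1, N - n1))
    return H


if __name__ == "__main__":
    H = ehrenfest()
    ups = sum(b > a for a, b in zip(H, H[1:]))
    print("H at start and end:", H[0], H[-1])
    print("steps on which H rose:", ups, "of", len(H) - 1)
```

In a run of this kind H ends well below its starting value, yet individual steps on which H rises do occur, and they become common once the neighbourhood of the minimum is reached.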

It may be remarked that the specific calculations, which
were made in § 48 (b) for the effect of collisions, were concerned with
the most probable value of dH/dt, as the average to be computed for
the members of the representative ensemble. The choice of this particular
average is based, however, merely on considerations as to mathematical
simplicity and historical familiarity. It would also be possible
to consider other averages, for example to study the mean value of
dH/dt and the fluctuations around this mean, in the case of different
kinds of systems and processes.

(b) Observations on the continued decrease of H with time. In
accordance with the foregoing, if we have an isolated system at a given
initial time $t_1$ in a condition which does not correspond to the lowest
possible value of H, we can expect the value of H to be decreasing with
the time. Hence it seems natural to conclude that the condition of such 
a system could be expected to continue to change with time in the 
direction of lower and lower values of H until the minimum possible 
is attained. Nevertheless, it will be necessary to examine the grounds 
for this conclusion somewhat more in detail, since our actual finding 
that the most probable value of dH/dt would be negative at the initial
time $t_1$ cannot be regarded as immediately equivalent to an assertion
as to the long time behaviour of H, for which some appropriate form
of integration over the time would be needed. Indeed, since we realize
that occasional positive values of dH/dt are not eliminated by statistical
considerations as to the most probable value of that quantity, the
possibility has to be entertained that the very systems which exhibit
a negative value of dH/dt at some initial time $t_1$ would be the actual
ones most liable to exhibit positive values of that quantity at a later 
time. 

A justification for the above conclusion as to the long time behaviour
of H, which is at least partially satisfactory, may be obtained by considering
the nature of the succession of actual observations which would
be needed in order to follow the behaviour of H as dependent on time 
in the case of any given system of interest. To carry out such a succes- 
sion of observations, let us take some suitable isolated system, and, 
assuming no prior knowledge as to its history or condition, let us make 
an appropriate observation at an initial time $t_1$ on the distribution
of its component molecules, from which we can then compute the
corresponding value of $H(t_1)$. Let us assume that this initial value of
H is much larger than the lowest possible value.

On the basis of this information let us next consider what can be 
expected as to the future behaviour of the system. Since the above 
initial determination of molecular distribution would have, in accordance
with our previous discussion, only an approximate character, it
will not be sufficient for an exact specification of state, and we shall
have to employ the methods of statistical rather than of ordinary 
mechanics in order to predict the future behaviour. This we do by 
considering a representative ensemble of systems set up so that all of 
its members are in states that agree with the initially observed con- 
dition, but are otherwise distributed in accordance with our funda- 
mental postulate as to equal a priori probabilities. By considering the 
behaviour in this ensemble we can then calculate the average (most 
probable) value of dH/dt for the members of the ensemble. Since we
find that this average value is a negative quantity, we then predict
that the value of dH/dt will also presumably be negative for the actual
system of interest, and indeed in cases where H is very much larger
than its minimum value the prediction that dH/dt will be negative can
be taken as almost certain. 

So far our considerations have not gone beyond the original arguments
leading to the expectation of a negative value of dH/dt when
H has a value greater than its minimum. Let us now, however, proceed
a step farther by using our average value of dH/dt to make a prediction
as to the value of H for the system at a later time $t_2$. For this purpose
we may take

$$H(t_2) = H(t_1) + \frac{dH}{dt}\,(t_2 - t_1) \tag{49.4}$$

as giving a reasonable prediction, provided the time interval $(t_2 - t_1)$ is
short enough to justify the neglect of higher order terms. As the
average value of dH/dt is negative, we then predict that $H(t_2)$ will
be less than $H(t_1)$, provided the difference in time $(t_2 - t_1)$ is not too
great.

At the later time $t_2$ let us now consider that we actually make a new
observation of molecular distribution, which will permit us to determine
the actual value of $H(t_2)$ for the system of interest at time $t_2$. In
accordance with the fact that the value of dH/dt inserted in (49.4) was
only an estimate of what could be expected on the average, the actual
value $H(t_2)$ will presumably not agree precisely with the value which
we have predicted. Nevertheless, it will almost certainly be less than
the original value $H(t_1)$, since the negative character of dH/dt is very
highly probable.

Starting with the new observation of molecular distribution, which
has given us the actual value of H at time $t_2$, we can then again make
a prediction as to the value of dH/dt. Let us do this by now setting
up a new representative ensemble, constructed so that all its members
are in states that agree with our new observation of condition at time $t_2$,
but are otherwise again distributed in accordance with our postulate
as to equal a priori probabilities for equal regions in the phase space.
Since this new ensemble would evidently again lead to the prediction
of a negative value for dH/dt, assuming that H has not yet reached its
minimum, we could then proceed as before to the conclusion that $H(t_3)$
at a still later neighbouring time $t_3$ would almost certainly be less than
$H(t_2)$ at time $t_2$. Continuing in this fashion, we are thus provided with
arguments for regarding it as highly probable that successive observa- 
tions on the condition of the system of interest would be found to 
correspond to lower and lower values of H. 

Nevertheless, as already intimated above, this treatment may not 
seem entirely satisfactory. From the point of view of present interest,
legitimate reasons for dissatisfaction could, of course, only lie in the
possibility that the above application of the methods of statistical
mechanics has not really been carried out rigorously in the manner
demanded by our postulatory basis. In accordance with that basis we
are to draw conclusions as to the expected behaviour of a given system
of interest, in a partially specified state, by considering the average
behaviour in an appropriate representative ensemble of similar systems.
To obtain such an appropriate ensemble we distribute the representative
points for its members throughout the phase space, in the first place
in such a manner as to agree with what partial knowledge we do have
as to the state of the system of interest, and in the second place, in
accordance with our fundamental postulate as to equal a priori probabilities,
in as uniform a manner as is consistent with that partial
knowledge of state. We then take the average behaviour in this 
ensemble as giving a good prediction for the behaviour of the actual 
system. 

Looking back now on the foregoing attempt to apply these methods,
it will be agreed that our first application, at the initial time $t_1$, was
correctly made, since we then considered the behaviour of a representative
ensemble which was constructed to agree with all that we
knew about the state of the system, namely the results given by our
initial observation of molecular distribution at time $t_1$. It may be contended,
however, that our second application of the prescribed methods
at the later time $t_2$ was not so correctly made, since we then considered
a representative ensemble which was constructed to agree with the
results of our new observation at that time, but not in such a manner
as to make any allowance for the additional circumstance that the
system was known to have passed in the time interval $t_1$ to $t_2$ from
the actually observed initial to the actually observed later condition.
Similar remarks would, of course, also apply at times later than $t_2$.

In principle it would be possible to overcome this difficulty, en- 
countered in applying the methods of statistical mechanics at a time 
when our knowledge of the instantaneous condition of a system is 
supplemented by knowledge as to previous behaviour, by constructing 
a representative ensemble so distributed as to agree both with the 
instantaneous condition and with the past behaviour of the system of 
actual interest, and otherwise left uniform in agreement with the funda- 
mental postulate as to equal a priori probabilities. Thus, returning to 
the above example, our second application of the methods of statistical 
mechanics at time $t_2$ would seem thrown into a justifiable form if we
should eliminate from the ensemble representing the known condition
of the system at time $t_2$ all those members which could not have come
in the interval $t_1$ to $t_2$ from the condition known to prevail at the earlier
time, and should then use this appropriately modified ensemble to draw
conclusions as to the expected value of dH/dt for the system of interest.
The rigorous application of this corrected procedure would not be 
simple, however, nor perhaps indeed fall within the range of treatments 
for which statistical methods are specially suited, since the elimination 
of systems which could not pass in the time $t_1$ to $t_2$ from states
corresponding to the original to those corresponding to the later condition
could only be carried out with the help of an elaborate investigation
of the precise mechanical behaviour of the systems composing the
original representative ensemble at time $t_1$. Hence we shall now
content ourselves with the statement that it seems reasonable to believe,
in cases where the initial and later observations of the conditions at
times $t_1$ and $t_2$ are not sufficiently exact to lead to a close specification
of precise state, that the states to be eliminated from the second
representative ensemble, as not agreeing with the known time of passage,
could usually be regarded as scattered among the states corresponding
to the condition at time $t_2$ in such a random manner as not to disturb
the previous conclusion that the average value of dH/dt for the members
of the ensemble would be a negative quantity.† If this be accepted,
we could then maintain the conclusion that any actual isolated system
of interest could usually be expected to exhibit lower and lower values
of H on its way towards the lowest possible value, at least until the
successive observations of conditions and times of passage could be
used in principle for such a precise determination of state that the use 
of a representative ensemble corresponding merely to the latest observa- 
tion of condition became clearly inappropriate. 

This demonstration of the probability for a continued decrease of H
with time is therefore now clearly seen to lack rigour and to apply to
what may be called typical cases. Indeed one can readily invent
cases where a knowledge of the past behaviour of a system would
be highly relevant to the future changes to be expected in H. We shall
later supplement the above considerations as to the long time behaviour
of H by a different method of treatment based on a generalization of
the H-theorem. Note in particular the remarks at the end of § 61.

(c) H-theorem and the principle of dynamical reversibility. We now
turn to the treatment of an apparent inconsistency between exact and
statistical mechanics which originally seemed very puzzling. On the
one hand, in accordance with the H-theorem, as discussed in the present
chapter, it is evident that the methods of statistical mechanics make
it necessary to expect a preferential tendency for isolated systems to
change in the direction of smaller values of H, provided the lowest
possible value of that quantity has not been attained. On the other
hand, in accordance with the principle of dynamical reversibility, as
discussed in the preceding chapter, it is well known that the laws of
exact mechanics make it equally possible for any internal behaviour of
an isolated system and the reverse of that behaviour to take place.
Hence at first sight, as originally suggested by Loschmidt,‡ some
conflict between the conclusions drawn from statistical and
from exact mechanics might seem to be possible, since for any behaviour
of a system such that H was decreasing with time it would be equally
possible to have another behaviour in which H was increasing with time.

† Compare the similar point of view expressed by P. and T. Ehrenfest, Encykl. d.
Math. Wiss. IV. 2, ii, Heft 6, p. 60, § 18c. It may be mentioned in this connexion,
nevertheless, that Burbury (Kinetic Theory of Gases, Cambridge, 1899) was always of
the opinion that the continued decrease of Boltzmann’s H as a result of collisions could
not be maintained, since the collisions would themselves result in a correlation of
velocities for neighbouring molecules which would invalidate the use of the hypothesis
of molecular chaos, i.e. of the postulate as to equal a priori probabilities, in calculating
the effect of further collisions.

‡ Loschmidt, Wien. Ber. 73, 139 (1876), and 76, 67 (1877).





To have a sharp example of this apparent possibility for conflict, let 
us consider a gas enclosed in an isolated container, which is itself 
divided into two equal compartments by a partition perforated with 
a small hole, and let us start out with the gas distributed quite unequally 
in amount between the two compartments. In accordance with the 
H-theorem, we can then expect the gas to flow through the connecting
hole from the high- to the low-pressure side of the partition, since the
process of equalizing the distribution would lead, as we have seen in
§ 48 (c), to lower and lower values of H. In accordance with the
principle of dynamical reversibility, nevertheless, it would be conceptually
possible, by reversing all molecular velocities at any time during the 
outward flow, to obtain a natural mechanical motion of the system 
with the molecules of the gas retracing their paths from the low- 
pressure side back through the hole in the partition into the high- 
pressure side. Furthermore, it may be emphasized that the possibility 
for such reverse behaviour would exist for any possible precisely speci- 
fied motion by which the flow from the high- to the low-pressure side 
might be taking place. We may then state the apparent possibility for 
conflict, with the help of anthropomorphic language, in the form: Why 
does the gas prefer to obey the H-theorem and flow from the high- to
the low-pressure side, when the principle of dynamical reversibility
provides it in every case with an equally valid mechanical behaviour
in the opposite direction?
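
The reversal of velocities contemplated here may be illustrated by a minimal sketch, which is not part of the original treatment. It assumes non-interacting point particles moving in one dimension between elastic walls at x = 0 and x = 1, and uses exact rational arithmetic so that no rounding error disturbs the retracing of paths. The names `step` and `evolve`, the starting values, and the time step are all hypothetical choices made for illustration; the step is exact provided no particle travels more than the width of the box in one interval.

```python
from fractions import Fraction

def step(x, v, dt):
    """Advance one particle by a time dt in the box 0 <= x <= 1 with elastic walls."""
    x = x + v * dt
    if x < 0:        # passed the left wall: fold the position back and reverse v
        x, v = -x, -v
    elif x > 1:      # passed the right wall
        x, v = 2 - x, -v
    return x, v

def evolve(state, dt, steps):
    """Apply `step` to every (position, velocity) pair, `steps` times over."""
    for _ in range(steps):
        state = [step(x, v, dt) for x, v in state]
    return state

# Illustrative initial state (assumed values): all particles near the left wall.
start = [(Fraction(1, 10), Fraction(3, 7)),
         (Fraction(1, 5), Fraction(-2, 3)),
         (Fraction(3, 10), Fraction(5, 11))]
dt = Fraction(1, 8)

forward = evolve(start, dt, 200)                 # the "natural" motion
reversed_state = [(x, -v) for x, v in forward]   # reverse every velocity
back = evolve(reversed_state, dt, 200)           # the particles retrace their paths

print([x for x, _ in back] == [x for x, _ in start])
```

After the second run each particle is found at its starting position with its original velocity reversed, which is the behaviour the principle of dynamical reversibility requires for any precisely specified motion.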

The resolution of the apparent difficulty lies in maintaining a clear
distinction between the use of statistical mechanics to treat the
behaviour of systems in conditions that correspond to a range of different
possible precise states, and the use of exact mechanics to treat the
behaviour of systems in precisely defined mechanical states. With this
distinction in mind we then see that there is really no opportunity for
conflict between the H-theorem and the idea of dynamical reversibility
since the two principles apply to different observational situations.
The H-theorem is a principle of statistical mechanics which applies to
a system which is only known to be in a condition that will correspond
in typical cases to many different precise states; when this condition
is very different from that of equilibrium the principle asserts that most
of these states will be such that there is a high probability for the
system to move towards conditions with a lower value of H. The idea
of dynamical reversibility is a principle of exact mechanics which
applies to a system known to be in some particular precise state and
hence to be carrying out some particular precise motion; the principle
then asserts that it is entirely possible to specify another (reverse)
precise state for the system such that the system would carry out the
reverse of the first motion. The H-theorem predicts a given direction
as probable for spontaneous changes in the condition of a system known 
to be in one or another of a collection of different states. The principle 
of dynamical reversibility says that it would be conceivably possible, 
with an exact knowledge of state and adequate apparatus for the 
reversal of velocities, to secure a reversal for any precise motion of 
the system. The two principles thus apply to different situations and 
there is no occasion for conflict. 

The above discussion may be illustrated with the help of our previous
example of the gas in the container having two connected compartments.
The initial specification that the gas is unevenly divided
between the two equal compartments gives nowhere nearly enough 
information to define a precise state of the system, and by examining 
the states that could correspond to the condition described, making 
use of the principle of equal a priori probabilities for different equal 
extensions in the phase space, we see that the gas has an overwhelming 
probability of being in a state such that flow would take place from 
the high- to the low-pressure side. This conclusion is in no way
contradicted by the fact that for any precisely defined succession of
mechanical states by which the equalizing flow might take place, it
would be theoretically possible to set up a behaviour in the opposite
direction. The reason for the flow from the high- to the low-pressure
side may be said to lie in the circumstance that it is very easy to put
the gas unequally in the two compartments in such a way that flow
would take place in a direction to equalize the pressure, but that it 
would be very difficult to adjust the positions and velocities of the 
molecules so as to obtain a net transfer of molecules through the hole 
in the opposite direction from the low- to the high-pressure side. 
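
The overwhelming preference for equalizing flow can be illustrated by a rough numerical sketch. This is an assumption-laden toy and not the treatment of the text: it replaces the mechanics of the gas by a simple chance rule (at each step one molecule, picked at random, passes through the hole), and it evaluates H as the sum of n log n over just two cells, one for each compartment. The function names, the molecule count, and the seed are hypothetical.

```python
import math
import random

def h_value(counts):
    """H = sum of n_i log n_i over the cells, taking 0 log 0 as 0."""
    return sum(n * math.log(n) for n in counts if n > 0)

def simulate(n_molecules, n_left, steps, seed=0):
    """Follow the number of molecules in the left compartment and the value of H."""
    rng = random.Random(seed)
    history = [h_value((n_left, n_molecules - n_left))]
    for _ in range(steps):
        # A molecule chosen at random passes through the hole to the other side.
        if rng.random() < n_left / n_molecules:
            n_left -= 1
        else:
            n_left += 1
        history.append(h_value((n_left, n_molecules - n_left)))
    return n_left, history

# Start with 950 of 1000 molecules on the left (assumed values).
n_left, history = simulate(1000, 950, 5000)
print(n_left, history[0], history[-1])
```

In such a run H falls from its initial value towards the neighbourhood of its minimum and thereafter merely fluctuates, although every individual step of the chance rule is as likely to be retraced as any other.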

It may be regarded as a great achievement of Boltzmann that we
are now able to comprehend the compatibility between the reversibility,
implied by the laws of exact mechanics, and those actual irreversibilities
observed in mechanical behaviour, which must be treated by the
methods of statistical mechanics. The compatibility depends on the
different degrees of observational precision involved in mechanical
situations which are appropriately treated by the methods of exact or
of statistical mechanics. The reconciliation between reversibility and
irreversibility, which has thus been obtained, may even be thought of
as providing the appropriate reconciliation between the divergent points
of view of Newtonian and of Aristotelian mechanics as to the natural
state of a material body being one of uniform motion or of rest.

(d) H-theorem and the occurrence of continued fluctuations. We may
now complete this long discussion of Boltzmann’s H-theorem by
considering another apparent difficulty which was first pointed out by
Zermelo.† As we have seen in the foregoing, we can expect to obtain
an immediate verification of the tendency for H to decrease with time
whenever we definitely set up an isolated system in a condition having
a value of H much greater than its minimum, since there will then be
an overwhelming probability that the system will actually be in a
precise state from which changes in the direction of lower values of H must
ensue. The situation is altered, however, if we desire to obtain a
verification of the H-theorem by observing the long time behaviour of an
isolated system, since it then turns out that we must regard this
continued behaviour as a succession of fluctuations in which the value of
H will increase as often as it decreases. At first sight such a behaviour
might seem to contradict the previous conclusion as to the tendency
for H to decrease.

The reason for the conclusion that the long time behaviour of an
isolated system must be such as to include increases as well as decreases
in H may be based on a theorem of Poincaré.‡ In accordance with this
theorem, an enclosed isolated mechanical system if left to itself would
in general carry out a quasi-periodic sort of motion, so that the system
would always return after a finite interval of time to pass through
states at least in the very close neighbourhood of those through which
it has previously passed. The derivation of the theorem depends, on
the one hand, on the finite total extension of phase space that would
be available to the phase points representing spatially confined and
isolated systems, and depends, on the other hand, in accordance
with Liouville’s theorem, on the constant extension which would
be permanently needed for any group of phase points representing
systems originally started off in neighbouring states. As a
consequence of the theorem, we must conclude that any decreases in
H which take place during the continued behaviour of such a system
must be balanced in the course of time by corresponding increases,
and vice versa.
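
The two ingredients of the derivation, a finite available extension and a motion that preserves extension, can be seen at work in a toy analogue. The sketch below is only an illustration under stated assumptions: it replaces the mechanical system by an invertible map on a finite grid of n by n points (a discretized form of the so-called cat map, chosen here merely because it is a one-to-one rearrangement of the grid), so that "extension" is simply the number of grid points. The function names and the grid size are hypothetical.

```python
def cat_map(point, n):
    """One step of an invertible (one-to-one) map on the n-by-n grid of integer points."""
    x, y = point
    return ((2 * x + y) % n, (x + y) % n)

def recurrence_time(point, n):
    """Number of steps after which the given point first returns to itself."""
    current, steps = cat_map(point, n), 1
    while current != point:
        current, steps = cat_map(current, n), steps + 1
    return steps

n = 5
times = {(x, y): recurrence_time((x, y), n) for x in range(n) for y in range(n)}
print(times[(1, 0)], max(times.values()))
```

Because the map is one-to-one and the grid is finite, every point must return, and no return can take longer than the total number of grid points; the theorem of Poincaré makes the corresponding statement for neighbourhoods in a continuous phase space.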

† Zermelo, Wied. Ann. 57, 485 (1896).

‡ Poincaré, Acta Math. 13, 67 (1890).

In this connexion it is also to be noted that our fundamental principle
as to equal a priori probabilities has a bearing on the conclusion that
increases as well as decreases in H must take place in isolated systems.
In accordance with this principle, if we select a system which is merely
known to be isolated with an energy content, say, between E and
$E+\delta E$, there will then be equal probabilities for finding the system, on
precise examination, in states corresponding to any one of the equal
regions into which we might divide the shell in the phase space lying
between E and $E+\delta E$. But this means that an isolated system, chosen
at random to use for an investigation of long time behaviour, is equally
likely to be found in any state and in the reverse of that state, since
such pairs of states are equally represented in the phase space. Hence,
in making tests on the behaviour of isolated systems, the samples that
we use are equally likely to be carrying out any particular kind of
motion and the reverse of that motion. This, too, makes it necessary
to conclude that the long time behaviour of isolated systems can exhibit
increases as well as decreases in H, in agreement with the above
conclusion that any one isolated system would exhibit a succession of
balancing increases and decreases of H as time proceeds.

We must now show that there really is no necessary conflict between 
this new conclusion, that an isolated system left to itself would exhibit 
increases and decreases of H with equal frequency, and our previous 
conclusion given by the H-theorem that the most probable behaviour 
for an isolated system with a value of H greater than the minimum 
would be for H then to decrease with time. The possibility of combining
these two conclusions without contradiction depends on two factors. 
In the first place it is to be noted that the H-theorem only makes an 
assertion as to the probable or average behaviour of H, and hence does 
not rule out the possibility for actual increases as well as decreases in 
that quantity to occur. In the second place it is then to be noted that 
a succession of increases and decreases of H with time is not incom- 
patible with a large probability for that quantity to decrease with time 
from any specified high value, provided that value of H usually stands 
at the peak of a trajectory from which changes back to lower values 
do take place. 

The combination of the H-theorem with the conclusion that H must
rise and fall with equal frequency thus provides us with a good
qualitative picture of the long time behaviour of an isolated system, as
consisting of a succession of fluctuations in the value of H, up from the
neighbourhood of the minimum and back again, with an almost certain
immediate return downward from any high value of H which is
attained. For example, if we take $H_1$ as the minimum possible value
and $H_2$ and $H_3$ as values which are considerably higher in the order
given, we may regard the succession of steps

$$H_1 \to H_2 \to H_1 \tag{49.5}$$

as taking place much more frequently than the successions

$$H_3 \to H_2 \to H_1 \quad\text{or}\quad H_1 \to H_2 \to H_3 \tag{49.6}$$

and these again much more frequently than the succession

$$H_3 \to H_2 \to H_3 \tag{49.7}$$


Under these circumstances a high value of H, such as $H_2$ in the present
illustration, would almost certainly be attained as the top point of a
fluctuation, and immediate return to lower values would result. We
then see that increases and decreases of H could take place with equal
frequency, and yet the conclusion still be maintained that the system
would almost certainly move towards lower values when known to have
a high value of H, such as $H_2$ in the above illustration.
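
The relative frequencies of the three kinds of succession can be worked out exactly in a simple chance model, offered here only as a hedged illustration and not as the argument of the text. The sketch assumes n molecules shared between two compartments, one randomly chosen molecule changing sides at each step, and the model in its stationary state; a count k above n/2 then plays the part of a high value of H. The function name and the numbers are hypothetical.

```python
from fractions import Fraction
from math import comb

def neighbour_pattern_probabilities(n, k):
    """For the stationary two-compartment model, the chance that a count k (0 < k < n)
    is found as the peak of a fluctuation, on a slope, or at the bottom of a valley."""
    def weight(m):                      # stationary probability of the count m
        return Fraction(comb(n, m), 2 ** n)

    def down(m):                        # chance of the step m -> m - 1
        return Fraction(m, n)

    def up(m):                          # chance of the step m -> m + 1
        return 1 - down(m)

    came_from_below = weight(k - 1) * up(k - 1) / weight(k)
    came_from_above = weight(k + 1) * down(k + 1) / weight(k)
    return {
        "peak": came_from_below * down(k),
        "slope": came_from_below * up(k) + came_from_above * down(k),
        "valley": came_from_above * up(k),
    }

probs = neighbour_pattern_probabilities(100, 70)
print(probs)
```

With 70 molecules out of 100 on one side, the count is the top of a fluctuation in 49 cases out of 100, lies on a rising or falling slope in 42, and is the bottom of a valley in only 9, which is the ordering of frequencies described above, while rises and falls taken over the whole motion remain equally frequent.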

This finally satisfactory elucidation of the problem raised by Zermelo
is particularly due to P. and T. Ehrenfest,† who appreciated the
possibility of an ultimate analysis of the changes in H with time into a
succession of steps brought about by the transfer of individual
molecules from one cell $\delta v_\mu = \delta q_1 \dots \delta p_r$ in the $\mu$-space to another. It is this
analysis of the behaviour of H into discontinuous changes that makes
it possible to understand the sudden reversals that can occur at the
peak of an upward fluctuation.

Several remarks of interest may be made in connexion with the
fluctuations that we must now regard as taking place in enclosed
isolated mechanical systems.

In the first place it will be noticed that the successions of steps
(49.5-7), which we have considered above, are entirely symmetrical
with respect to the forward and backward directions of time, and hence
that the immediate return to lower values of H from the high point of
a fluctuation can be regarded as taking place with great probability
equally well for any particular motion of a system and for the reverse
of that motion. Again we see that the conclusions of statistical
mechanics stand in no kind of conflict with the reversibility of time
implied by the fundamental laws of exact mechanics.

† P. and T. Ehrenfest, Encykl. d. Math. Wiss. IV. 2, ii, Heft 6, pp. 41 ff.

It will be further noticed that the greater probability of occurrence
which we have assigned to the downward succession of steps given by
(49.6), as compared with the reversal of steps given by (49.7), is in
agreement with the conclusion discussed in § 49 (b) that the decrease
in H from an originally high value, in this case $H_3$, can be regarded
as continuing on with great probability down through lower values
towards the minimum.

It is also of interest to consider the relative probabilities for
fluctuations of different extents away from the minimum value of H. In
accordance with the circumstance that the probable value of dH/dt
approaches zero as we approach the minimum of H, we can regard
small fluctuations away from the minimum as occurring with great
frequency. In the case of large upward fluctuations, however, not only
does the probable value of dH/dt become more and more negative as
we go to higher values of H, but also the succession of steps in which
H must increase rather than decrease becomes longer and longer.
Hence we may regard large fluctuations of H away from the minimum
as occurring very infrequently in the case of an isolated system. In
this connexion it is of interest to note the estimate of Boltzmann† that
times enormously great compared with $10^{10^{10}}$ years would be needed
before an appreciable separation would occur spontaneously in a 100
cubic centimetre sample of two mixed gases.
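
The rapid falling off in the frequency of large fluctuations may be made concrete by a small calculation. It is an illustrative sketch only, and does not reproduce Boltzmann's estimate: it assumes n molecules each equally likely to be found in either half of a vessel, independently of the others, and asks how often a given excess in one specified half would be met with on a random inspection. The function names are hypothetical.

```python
import math
from fractions import Fraction
from math import comb

def probability_of_count(n, k):
    """Chance of finding exactly k of n molecules in one specified half of the vessel."""
    return Fraction(comb(n, k), 2 ** n)

def probability_at_least(n, k_min):
    """Chance of finding k_min or more of the n molecules in that half."""
    return sum(probability_of_count(n, k) for k in range(k_min, n + 1))

def log10_probability_all_in_one_half(n):
    """Common logarithm of the chance that all n molecules are found in that half."""
    return -n * math.log10(2)

small = probability_at_least(100, 60)   # a moderate excess, 60 or more out of 100
large = probability_at_least(100, 90)   # a large excess, 90 or more out of 100
print(float(small), float(large), log10_probability_all_in_one_half(1000))
```

Even with only 100 molecules the large excess is rarer than the moderate one by many powers of ten, and the disparity grows exponentially with the number of molecules, so that for any sample of ordinary size a spontaneous separation is never to be expected in practice.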

We thus bring to an end our long discussion of this famous
H-theorem, which has outlived so many attacks and misunderstandings.
We can only conclude with words of admiration for the genius of
Boltzmann. His fruitful discovery of a suitable function for measuring the
displacement of a whole system of molecules from equilibrium, and
his elegant mastery of the complicated effect of collisions in making
such displacements decrease with time, alike compel attention. His
reconciliation of phenomenological irreversibility with the reversible
character of the laws of exact mechanics, and his understanding
of the compatibility of continued fluctuations with a tendency
towards equilibrium, are among the great achievements of theoretical
physics. And his penetrating remarks on the great role that
fluctuations might play in the long time behaviour of the universe as a
whole, show Boltzmann’s preoccupation with the deepest problems of
physics.

† Boltzmann, Vorlesungen über Gastheorie, Leipzig, 1912, Part II, p. 264.





50. H-theorem and the condition of equilibrium
(a) Maxwell-Boltzmann distribution when H is a minimum. We see
from the foregoing that an enclosed isolated mechanical system,
definitely started off with a large value of H, can be expected first to
change in the direction of lower and lower values of H, and then to end
up by carrying out a succession of fluctuations which will rarely carry
it to values of H much above the minimum for that quantity. We
must now investigate the equilibrium condition of a system when H is
at or near its minimum value.

To study the approach towards equilibrium, let us consider an
isolated system composed of n similar molecules, enclosed in a fixed
container of volume v, and with a specified energy content lying in the
narrow range E to $E+\delta E$. And let us express H for this system, as
in (47.7), in the form of the summation

$$H = \sum_i n_i \log n_i + \text{const.}, \tag{50.1}$$

where $n_i$ gives the number of molecules in the ith equal cell into which
we have divided the $\mu$-space for the kind of molecules under
consideration, and we take a sum over all possible cells i. The approach to
equilibrium will now consist in the decrease of H towards the minimum
possible value allowed by the specified constitution, volume, and energy
of the system.

The equilibrium distribution of the molecules will hence be
characterized by a minimum value of H, as determined by the variational
equation

$$\delta H = \sum_i (\log n_i + 1)\,\delta n_i = 0, \tag{50.2}$$

under the subsidiary conditions

$$\delta n = \sum_i \delta n_i = 0, \tag{50.3}$$

imposed by the fixed number of molecules, and

$$\delta E = \sum_i \epsilon_i\,\delta n_i = 0, \tag{50.4}$$

imposed by the fact that the mechanisms responsible for the change
of H cannot alter the total energy of the system. We immediately see,
however, that these equations have exactly the same consequences as
our earlier equations of § 29, requiring a conditional maximum for log P
instead of a conditional minimum for H. Hence we shall thus find a
complete agreement between our present treatment of equilibrium,
from the point of view of the behaviour of H with time, and our
previous treatment, from the point of view of the most probable
condition for the systems in an appropriate microcanonical ensemble.

In accordance with this finding, our previous solution (29.6) for the
distribution of molecules at equilibrium,

$$n_i = e^{-\alpha - \beta\epsilon_i}, \tag{50.5}$$

where $\alpha$ and $\beta$ are undetermined constants, will also satisfy our new
equations (50.2), (50.3), and (50.4). Furthermore, this solution may
also be rewritten as before in the more familiar form

$$\delta n = nC e^{-\epsilon/kT}\,\delta q_1 \dots \delta p_r, \tag{50.6}$$

where $\delta n$ is the number of molecules with coordinates and momenta in
the range $\delta q_1 \dots \delta p_r$, $\epsilon$ is the energy per molecule introduced into that
range, C and k are constants, and T is the temperature of the system.
We are thus once more led to the Maxwell-Boltzmann distribution law
for the molecules of a system that has come to equilibrium.
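
That an exponential distribution of this kind really gives a conditional minimum of H can be checked numerically in a small case. The sketch below is illustrative only and rests on assumptions not in the text: three cells with equally spaced energies 0, 1, 2, arbitrary values of the constants, and cell populations treated as continuous numbers. The displacement (+t, -2t, +t) is chosen because it leaves both the total number of molecules and the total energy unchanged for these particular energies. All names are hypothetical.

```python
import math

def h_value(counts):
    """H = sum of n_i log n_i over the cells (the additive constant is dropped)."""
    return sum(n * math.log(n) for n in counts if n > 0)

energies = [0.0, 1.0, 2.0]     # assumed, equally spaced cell energies
alpha, beta = -5.0, 0.7        # assumed values of the undetermined constants
n_eq = [math.exp(-alpha - beta * e) for e in energies]

def displaced(counts, t):
    """Shift populations by (+t, -2t, +t), which conserves number and energy here."""
    return [counts[0] + t, counts[1] - 2 * t, counts[2] + t]

def total_energy(counts):
    return sum(e * n for e, n in zip(energies, counts))

for t in (-5.0, -1.0, 1.0, 5.0):
    trial = displaced(n_eq, t)
    print(t, h_value(trial) - h_value(n_eq))
```

Every permitted displacement, in either direction, raises H above its value for the exponential distribution, in agreement with the variational equation and its two subsidiary conditions.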

(b) Steady condition when H is a minimum. It is also of interest to
show that this Maxwell-Boltzmann distribution, prevailing when H is
a minimum, satisfies the conditions, previously considered in § 48, for
a probable cessation of the changes in H with time. This will then
agree with our ideas that an isolated system, which once arrives at the
condition of minimum H, will thereafter tend to remain in the
neighbourhood of that condition.

Comparing the Maxwell-Boltzmann distribution (50.6) with our
previous general expression (47.1) for the number of molecules

$$\delta n = f(q_1 \dots p_r, t)\,\delta q_1 \dots \delta p_r \tag{50.7}$$

in the specified range $\delta q_1 \dots \delta p_r$, we see that our general distribution
function f will assume the special form

$$f(q_1 \dots p_r) = nC e^{-\epsilon/kT} \tag{50.8}$$

when the Maxwell-Boltzmann distribution, prevailing with H a
minimum, has been attained. We may now consider several examples
showing that this value of the distribution function f does satisfy
our previous conditions for the probable value of dH/dt to go to zero.


In the case of bimolecular collisions $\binom{j,\,i}{k,\,l}$ we can write, in accordance
with (50.8),

$$f_i = nC e^{-\epsilon_i/kT}, \quad f_j = nC e^{-\epsilon_j/kT}, \quad f_k = nC e^{-\epsilon_k/kT}, \quad f_l = nC e^{-\epsilon_l/kT}, \tag{50.9}$$

as the values of the distribution function f that would apply to
molecules in the states i, j and k, l from and to which such a collision leads;
and, in accordance with the principle of the conservation of energy, we
can write

$$\epsilon_i + \epsilon_j = \epsilon_k + \epsilon_l \tag{50.10}$$

as a connexion between the energies of the molecules before and after
collision. By combining (50.9) with (50.10), we are then led for any
collision of the form $\binom{j,\,i}{k,\,l}$ to the result

$$f_i f_j = f_k f_l. \tag{50.11}$$


This, however, is our previous condition (48.18) for the probable value
of dH/dt to go to zero, in so far as it depends on the effect of bimolecular
collisions. Similarly, in the case of trimolecular collisions we shall be
led from (50.8) to results of the form

$$f_i f_j f_k = f_l f_m f_n \tag{50.12}$$

for a collision initiated and completed by molecules in the states i, j, k
and l, m, n; and this is our previous condition (48.20) for the probable
value of dH/dt to go to zero, in so far as it depends on trimolecular
collisions. Again, in the case of a transport of molecules between cells
in the $\mu$-space, say $\delta v\,\delta\omega$ and $\delta v'\,\delta\omega$, which correspond to the same
energy but to different spatial locations, we are led by (50.8) to results
of the form

$$f = f', \tag{50.13}$$

and this, as discussed in § 48 (c), is the condition for the probable rate
of transport and corresponding rate of change in H to go to zero. We
thus appreciate in general that we are justified in regarding the
minimum of H as corresponding to a steady condition of an isolated system
around which, however, fluctuations can take place.
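
The step from the exponential form of f and the conservation of energy to the equality of the products of distribution functions is simple enough to verify by direct arithmetic. The following sketch is offered only as a check under assumed numbers: the energies, the value of kT, and the constants n and C are arbitrary illustrative choices, and the function name `f` merely mirrors the symbol of the text.

```python
import math

def f(energy, n=1.0e3, c=2.5, kT=0.8):
    """Equilibrium distribution function n C exp(-energy / kT), with assumed constants."""
    return n * c * math.exp(-energy / kT)

# Bimolecular case: choose three energies freely and fix the fourth by conservation.
e_i, e_j, e_k = 0.3, 1.9, 1.2
e_l = e_i + e_j - e_k
bimolecular = (f(e_i) * f(e_j), f(e_k) * f(e_l))

# Trimolecular case: the same argument with three molecules on each side.
e_a, e_b, e_c, e_d, e_e = 0.2, 0.7, 1.5, 0.9, 0.4
e_f = e_a + e_b + e_c - e_d - e_e
trimolecular = (f(e_a) * f(e_b) * f(e_c), f(e_d) * f(e_e) * f(e_f))

print(bimolecular, trimolecular)
```

In each case the two products agree to within rounding, since the exponents add to the same total energy on both sides, and the equality holds whatever values are taken for the constants.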

(c) Detailed balance when H is a minimum. In accordance with the
above, the Maxwell-Boltzmann distribution of molecules, which is
attained when the quantity H for an isolated system reaches its
minimum value, has a character such that the further action of collisions
or of other internal molecular processes can be expected to leave the
overall condition of the system unchanged. In order to maintain such
a steady condition it is evident that the rates of different molecular
processes, by which we can regard the molecules as being transferred
between different regions in the $\mu$-space, must be related to each other
in such a manner that the resulting deficits and excesses in the
molecular populations of the different regions involved are kept balanced
out. Thus, for example, it is evident from the foregoing discussion, in
the case of bimolecular collisions in a homogeneous gas, that the
probable rates for the different collisions

$$\binom{2,\,1}{3,\,4}, \quad \binom{4,\,3}{5,\,6}, \quad \binom{6,\,5}{7,\,8}, \quad \dots, \quad \binom{n,\,n-1}{1,\,2} \tag{50.14}$$

in any closed cycle of corresponding collisions, assume such values, when
the Maxwell-Boltzmann distribution is attained, that there will be a
balance between the rates at which these collisions are adding and
subtracting molecules from the pairs of regions involved. Furthermore,
in the specially simple case of spherical molecules, where the cycle of
corresponding collisions reduces to a pair of inverse collisions of the form

$$\binom{j,\,i}{k,\,l}, \quad \binom{l,\,k}{i,\,j} \tag{50.15}$$

it is evident that we can regard the effect of each such collision as
directly compensated by the effect of the inverse collision.

It will now be of interest to consider a general method, when H is
a minimum, of setting up a specially simple kind of balance, between
the effects of different molecular processes, which is often important in
giving an insight into the treatment of physical chemical problems. To
obtain such a balance we cannot, of course, in general correlate each
molecular process with an inverse process, and thus obtain a direct
compensation of effects as in the above-mentioned simple case of
collisions between spherical molecules, since in general, as we have seen in
§ 42, processes which are the inverse of each other do not exist.
Nevertheless, it does prove possible to set up a useful balance by correlating
any given molecular process with the reverse process thereto, which, as
we know from the principle of dynamical reversibility, must necessarily
be capable of existence.

To investigate this we note, in accordance with the Maxwell-Boltzmann
distribution law (50.6), that the numbers of molecules, assigned
as equilibrium quotas to different equal infinitesimal regions in the
$\mu$-space, depend solely on the energies $\epsilon$ corresponding to those regions.
Thus, since the energy of a molecule is unaltered by a mere reversal
of velocities, it is evident that the probability of finding a molecule at
equilibrium in any specified range i of states in the $\mu$-space is equal
to the probability of finding a molecule in the range $-i$ composed of
the reverse states thereto. As a consequence we can also assign equal
values at equilibrium to the probabilities of finding any constellation
of interacting molecules in specified ranges of the $\mu$-space i, j, ..., and of
finding the reverse constellation of molecules in the ranges $-i$, $-j$, ....
Hence we can now conclude, under equilibrium conditions, that any
molecular process and the reverse of that process will be taking place
on the average at the same rate, provided, of course, that we use equal
ranges in the $\mu$-space in defining the two processes. This result may
be called the principle of microscopic reversibility at equilibrium.†
As an illustration of this principle we may consider, in the case of
a homogeneous gas which has come to equilibrium, the two reverse
processes which are provided by the reverse collisions $\binom{j,\,i}{k,\,l}$ and
$\binom{-l,\,-k}{-i,\,-j}$ as defined in Chapter V. It is then immediately evident, in agreement
with the above principle, that we can equate the probable rates of
these two collisions,

$$Z_{\binom{j,\,i}{k,\,l}} = Z_{\binom{-l,\,-k}{-i,\,-j}}, \tag{50.16}$$

since we at once appreciate that we can expect, at any time under
equilibrium conditions, to find the same number of pairs of molecules
present in the gas in the constellation (j, i), with which the first of these
collisions is beginning, and present in the constellation ($-j$, $-i$), with
which the second of the collisions is ending. It may further be noted
that the above equality could also be obtained from our previous
general expressions (46.14) for the rate of these two collisions, provided
we substitute therein the consequences of the Maxwell-Boltzmann
distribution and of the conservation of energy for any collision.

As a more general formulation of the principle of microscopic
reversibility at equilibrium, it will be convenient to employ the equation

$$Z_{i,j,\dots \to k,l,\dots} = Z_{-k,-l,\dots \to -i,-j,\dots}, \tag{50.17}$$

where $Z_{i,j,\dots \to k,l,\dots}$ is the average rate of occurrence, under equilibrium
conditions, of any specified process by which molecules are transferred
between regions in the $\mu$-space denoted by i, j, ... and by k, l, ..., and
$Z_{-k,-l,\dots \to -i,-j,\dots}$ is the average rate of the reverse process, where the
regions denoted by $-i$, $-j$, ... and $-k$, $-l$, ... consist of the reverse
states to those previously included. Using appropriately specified
regions in the $\mu$-space, this equation can be regarded as holding at
equilibrium for molecular processes of any degree of complexity.

With the help of this general expression of the principle of micro- 
scopic reversibility at equilibrium we may now also obtain a useful 


t The phrase, ‘principle of microscopic reversibility’, was first suggested by the 
writer, Phys. Bev. 23, 699 (1924). See also Tolman, Proc, Nat. Acad. 11, 436 (1926), 
and Statistical Mechanics, New York, 1927, § 204. The phrase is not very appropriately 
descriptive. It might be better to replace it by the phrase, ‘principle of equal frequency 
for reverse molecular processes at equilibrium’. The principle should in any case be 
distinguished from any statement as to equal &equenoies for inverse molecular processes, 
since these, in general, do not exist. 



164 


BOLTZMANN’S H-THEOREM 


Chap. VI 


expression for the total rates at which molecules would be transferred,
by the combined action of all different processes, between any one pair
of regions in the μ-space and between the pair of regions containing the
corresponding reverse states. For the total rates at which molecules
would be transferred at equilibrium, say from regions i to k in the
μ-space and in the reverse direction from −k to −i, we may evidently
write

$$\bar{N}_{i\to k} = \sum \bar{Z}_{i,\ldots\to k,\ldots} \quad\text{and}\quad \bar{N}_{-k\to -i} = \sum \bar{Z}_{-k,\ldots\to -i,\ldots} \qquad (50.18)$$

where the summations are taken over all possible processes which result
in the transfer of a molecule from i to k and from −k to −i in the
two cases respectively. It will be noted, however, that the two
summations are made up of rates applying to individual processes that are
reverse to each other. And hence, in accordance with (50.17), we can
now write the equality,

$$\bar{N}_{i\to k} = \bar{N}_{-k\to -i} \qquad (50.19)$$

between the total average rates with which molecules would be
transferred at equilibrium from regions i to k in the μ-space and from
regions −k to −i.

The relation given by (50.19) can now be used to show the possibility
mentioned above of setting up a simple kind of balance between the
effects of different molecular processes. In addition to processes by
which molecules are transferred from i to k and from −k to −i,
processes may also be possible in which molecules are transferred in other
ways between those four conditions. The total rates of transfer will in
any case, however, be subject to relations of the general form (50.19), so
that we can now evidently write

$$\begin{aligned}
\bar{N}_{i\to k} &= \bar{N}_{-k\to -i},\\
\bar{N}_{i\to -k} &= \bar{N}_{k\to -i},\\
\bar{N}_{-i\to k} &= \bar{N}_{-k\to i},\\
\bar{N}_{-i\to -k} &= \bar{N}_{k\to i}
\end{aligned} \qquad (50.20)$$

as relations connecting the total rates for all possible kinds of transfer
between the four specified regions in the μ-space. Adding these four
equations together we then obtain the desired result, which can be
written in the symbolic form

$$\bar{N}_{\pm i\to \pm k} = \bar{N}_{\pm k\to \pm i}, \qquad (50.21)$$

where the left-hand side denotes the total rate of transfer of molecules
at equilibrium from either of the regions i or −i in the μ-space to
either of the regions k or −k, and the right-hand side denotes the rate
from regions k or −k back to i or −i.

The result expressed by this equation may be called the principle of
detailed balance at equilibrium.† In accordance with this principle we
can now regard the steady condition, corresponding to the minimum
value of H for an isolated system, as maintained in such a way that
the number of molecules which pass per unit time from any pair of
regions ±i in the μ-space (containing states that are reverse to each
other) to any other similar pair ±k can be balanced against the
number which pass per unit time from ±k to ±i. Since we are usually
not interested in maintaining any distinction between reverse molecular
states, this result often proves very useful when the direct calculation
turns out to be much simpler for one of the two related rates of
transition than for the other. This is illustrated in the well-known method
of drawing conclusions as to rates of chemical activation from the much
more simply calculated rates of chemical deactivation.
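
The way in which the grouping of each state with its reverse produces a
balance can be checked on a small numerical case. The sketch below is
only an illustration: the four rates are invented numbers, and the names
`build_rates` and `pair_rate` are hypothetical helpers. The rates are made
to satisfy the equality between reverse total rates, (50.19), and nothing
more.

```python
# States are indexed 0 = i, 1 = -i, 2 = k, 3 = -k; the reverse of a state
# flips the sign, which here means flipping the lowest bit of the index.
LABELS = ["i", "-i", "k", "-k"]


def reverse(state):
    """Index of the reverse state (i <-> -i, k <-> -k)."""
    return state ^ 1


def build_rates():
    """Hypothetical equilibrium transfer rates n[a][b], molecules per unit time."""
    n = [[0.0] * 4 for _ in range(4)]
    chosen = {(0, 2): 5.0, (0, 3): 2.0, (1, 2): 3.0, (1, 3): 7.0}
    for (a, b), rate in chosen.items():
        n[a][b] = rate
        # Relation of the form (50.19): rate(a -> b) = rate(-b -> -a).
        n[reverse(b)][reverse(a)] = rate
    return n


def pair_rate(n, src, dst):
    """Total rate from either state of the pair src to either state of dst."""
    return sum(n[a][b] for a in src for b in dst)


rates = build_rates()
PAIR_I, PAIR_K = (0, 1), (2, 3)
forward = pair_rate(rates, PAIR_I, PAIR_K)
backward = pair_rate(rates, PAIR_K, PAIR_I)
print("i -> k:", rates[0][2], "  k -> i:", rates[2][0])
print("+-i -> +-k:", forward, "  +-k -> +-i:", backward)
```

In this made-up case the rate from i to k differs from the rate from k
back to i, so there is no balance between inverse processes, while the
totals between the two pairs of regions agree, as (50.21) requires.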

51. The generalized H-theorem

(a) Definition of the quantity $\bar{\bar{H}}$ for an ensemble. In the preceding
parts of this chapter we have given consideration to a method for
treating the approach of an isolated system of molecules towards its
final equilibrium distribution which was originally discovered by
Boltzmann. We shall now bring the chapter to a close by turning our
attention to a more general method for treating the same problem,
which was subsequently devised by Gibbs.

To carry out the Boltzmann method of treatment we first defined
a quantity H, which directly characterizes the condition of any system
of interest by having a value that depends at any instant on the
distribution of the molecules of that system among their different possible
states. We next introduced the methods of statistical mechanics by
showing, for an ensemble of systems, all members having the same
condition as that of the actual system of interest, that the average
(most probable) value of dH/dt would be negative for the members of
the ensemble. We then drew the conclusion that the value of H, for the
† The phrase ‘principle of detailed balance’ is due to the frequent consideration given
to the ‘detailed balancing’ of elementary processes by Fowler; see, for example, his
Statistical Mechanics, Cambridge, 1929. The phrase is an excellent one. The difficulty
mentioned by Fowler, Phil. Mag. 47, 264 (1924), as to the necessity of ruling out
cyclical molecular processes as a mechanism of balance in order to maintain the validity
of the principle, is met above and in Statistical Mechanics, New York, 1927, § 204, by
the device of grouping each molecular state with its reverse in considering the rates of
transfer between different molecular conditions.





system of interest itself, could be expected to decrease with time 
towards its minimum equilibrium value. 

To carry out the Gibbs method of treatment we shall introduce the
ideas of statistical mechanics at the very start by defining a new
quantity $\bar{\bar{H}}$, which characterizes the condition of the ensemble of
systems which we use as appropriate for representing the continued
behaviour of the actual system of interest. This new quantity will have
a value that depends at any instant on the distribution of the members
of the ensemble among their different states. We shall show that the
quantity $\bar{\bar{H}}$ for such a representative ensemble can be expected to
decrease with time towards a final minimum value, requiring a uniform
distribution in the phase space for the members of the ensemble lying
in any energy range E to E+δE. We shall then draw the conclusion
that the system of interest itself, if originally started off in a specified
energy range E to E+δE, could be expected to approach a final
equilibrium condition that would be represented by such a final
microcanonical distribution. As we shall see in what follows, especially when
we come to the quantum mechanics, the Gibbs point of view is really
more appropriate and powerful than that of Boltzmann.

Before proceeding to the definition of our new quantity $\bar{\bar{H}}$ for an
ensemble of systems, it will be necessary to introduce a distinction
between two different quantities for an ensemble, which may be called
its fine-grained density and its coarse-grained density of distribution
in the phase space.†

In accordance with the methods of Chapter III, we can regard the
precise state of an ensemble as specified, at any instant, by the density
$\rho(q_1 \ldots p_f)$ with which the phase points for its members are distributed
in the phase space that corresponds to the 2f coordinates and momenta
$q_1 \ldots p_f$ for the kind of system under consideration. Taking N as the
total constant number of systems in the ensemble, and assuming
normalization to unity, as now proves convenient, we can then write

$$\frac{dN}{N} = \rho(q_1 \ldots p_f)\, dq_1 \ldots dp_f \qquad (51.1)$$

for the probability of finding a member of the ensemble in the
infinitesimal range $dq_1 \ldots dp_f$ at the point $q_1 \ldots p_f$ in the phase space, and
can write

$$\int \cdots \int \rho(q_1 \ldots p_f)\, dq_1 \ldots dp_f = 1 \qquad (51.2)$$

† The clear-cut distinction between these two quantities is due to P. and T. Ehrenfest,
Encykl. d. Math. Wiss. IV. 2, ii. Heft 6, p. 60, § 23.





as the result of integration over all possible values of the q's and p's.
The quantity $\rho(q_1 \ldots p_f)$ appearing in these expressions may be spoken of
as the fine-grained density of distribution in the phase space, since, in
agreement with the assumptions of Chapter III as to a large enough
total population N for the ensemble, we can regard (51.1) as giving
a precise expression for the probability of finding a member in the
specified range as we let $dq_1 \ldots dp_f$ approach zero as nearly as we desire.

In making any actual measurement of the coordinates and momenta
of a system, however, it is evident that we ordinarily do not achieve
the precise knowledge of their values theoretically permitted by the
classical mechanics. For this reason we shall also be interested in
the probability of finding members of an ensemble within small but
finite regions having extensions $\delta q_1 \ldots \delta p_f$, which correspond to the
limits of accuracy actually available to us. For this purpose we may
then define another kind of density P (this can be read capital rho if
desired) at any point of the phase space by the equation

$$P(q_1 \ldots p_f) = \frac{\int \cdots \int \rho(q_1 \ldots p_f)\, dq_1 \ldots dp_f}{\delta q_1 \ldots \delta p_f} \qquad (51.3)$$

where the integration is taken over such a small finite range $\delta q_1 \ldots \delta p_f$.
The new quantity $P(q_1 \ldots p_f)$ is thus a mean of the fine-grained densities
in the neighbourhood of the point $q_1 \ldots p_f$, and may itself be spoken of
as the coarse-grained density at that point.

In accordance with this definition of coarse-grained density, we can
now write

$$\frac{\delta N}{N} = P(q_1 \ldots p_f)\, \delta q_1 \ldots \delta p_f \qquad (51.4)$$

for the probability of finding a member of the ensemble in the specified
small but finite region denoted by $\delta q_1 \ldots \delta p_f$. Furthermore, by regarding
the whole phase space as divided up into a collection of such regions,
all of the same extent $\delta q_1 \ldots \delta p_f$, we can evidently write the equation

$$\sum_m P_m\, \delta q_1 \ldots \delta p_f = 1, \qquad (51.5)$$

where $P_m$ is the coarse-grained density for the mth such region, and we
take a summation over all such regions m. Or, replacing the summation
by an integration over the whole of phase space, we can also write this
relation in a form similar to (51.2)

$$\int \cdots \int P(q_1 \ldots p_f)\, dq_1 \ldots dp_f = 1, \qquad (51.6)$$

where we still regard the coarse-grained density P at any point as an




appropriate mean of the fine-grained densities p in the neighbourhood 
of that point. 
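
The averaging that defines the coarse-grained density can be tried out
on a small one-dimensional example. What follows is a minimal sketch
under stated assumptions: the phase space is replaced by eight equal fine
cells, the density values are invented, and `coarse_grain` is a
hypothetical helper name.

```python
def coarse_grain(rho, block):
    """Replace each run of `block` fine cells by the mean of rho over that run."""
    if len(rho) % block != 0:
        raise ValueError("the number of fine cells must be a multiple of block")
    return [sum(rho[s:s + block]) / block for s in range(0, len(rho), block)]


fine_width = 0.125                               # extension of one fine cell (assumed)
rho = [0.0, 4.0, 0.0, 0.0, 2.0, 2.0, 0.0, 0.0]   # fine-grained density (invented)
block = 4                                        # fine cells per coarse region
coarse_width = block * fine_width                # the small but finite extension

P = coarse_grain(rho, block)
fine_total = sum(r * fine_width for r in rho)    # analogue of (51.2)
coarse_total = sum(p * coarse_width for p in P)  # analogue of (51.5)
print(P, fine_total, coarse_total)
```

Both densities are normalized to unity, yet the coarse-grained density
here is the same in the two regions although the fine-grained density
varies strongly inside each of them.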

With the help of the coarse-grained density P we may now define
the new quantity $\bar{\bar{H}}$ for any ensemble under consideration by the
summation

$$\bar{\bar{H}} = \sum_m P_m \log P_m\, \delta q_1 \ldots \delta p_f \qquad (51.7)$$

taken over all the regions m of equal extent into which we regard the
phase space as divided. Or, again replacing summation by integration
over the whole of phase space, we can also write the defining expression
in the form†

$$\bar{\bar{H}} = \int \cdots \int P \log P\, dq_1 \ldots dp_f, \qquad (51.8)$$

provided we still maintain our understanding of the coarse-grained
density P at any point as being a mean of the fine-grained densities
ρ taken over a small but finite extension $\delta q_1 \ldots \delta p_f$ in the phase space
in the neighbourhood of that point. It is clear that the $\bar{\bar{H}}$ so defined
depends in an essential manner on the size and character of the regions
$\delta q_1 \ldots \delta p_f$ which we introduce as corresponding to the observations
contemplated.

In connexion with this definition we may emphasize the distinction
between the two different quantities

$$\int \cdots \int \rho \log \rho\, dq_1 \ldots dp_f \quad\text{and}\quad \int \cdots \int P \log P\, dq_1 \ldots dp_f, \qquad (51.9)$$

the latter being the one by which $\bar{\bar{H}}$ is actually defined. It may be
noted, however, that we could also write

$$\bar{\bar{H}} = \int \cdots \int \rho \log P\, dq_1 \ldots dp_f, \qquad (51.10)$$

since log P would be constant over each one of the small regions
$\delta q_1 \ldots \delta p_f$ that were introduced above, and the integration of ρ over
such a region would give us $P\, \delta q_1 \ldots \delta p_f$. In accordance with our
definition we see that our new quantity $\bar{\bar{H}}$ may be regarded as the mean
value of log P for the ensemble as a whole. The symbolism that we
have adopted for the new quantity thus agrees with our general
convention of using the double bar to indicate a mean value for an
ensemble as a whole.
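
The equivalence of the forms (51.8) and (51.10), and their difference
from the fine-grained integral in (51.9), can be verified on a toy
discretization. This is a sketch only: the eight fine cells, the density
values, and the helper name `xlogy` are assumptions made for the
illustration.

```python
import math


def xlogy(x, y):
    """x * log(y), with the usual convention that a zero weight contributes 0."""
    return 0.0 if x == 0.0 else x * math.log(y)


fine_width = 0.125                               # extension of one fine cell (assumed)
block = 4                                        # fine cells per coarse region
rho = [0.0, 6.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0]   # fine-grained density (invented)

# Coarse-grained density, repeated on every fine cell of its region.
P_on_fine = []
for start in range(0, len(rho), block):
    mean = sum(rho[start:start + block]) / block
    P_on_fine.extend([mean] * block)

# Integral of rho log rho, first quantity in (51.9).
h_fine = sum(xlogy(r, r) * fine_width for r in rho)
# Integral of P log P, the defining form (51.8).
h_coarse = sum(xlogy(p, p) * fine_width for p in P_on_fine)
# Integral of rho log P, the alternative form (51.10).
h_mixed = sum(xlogy(r, p) * fine_width for r, p in zip(rho, P_on_fine))

print(round(h_fine, 6), round(h_coarse, 6), round(h_mixed, 6))
```

The last two numbers agree, since log P is constant over each coarse
region, while the fine-grained integral is larger whenever ρ varies inside
a region.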

(b) Two necessary lemmas. Two lemmas may now be presented
which will be needed for the desired investigation of the rate of change
in $\bar{\bar{H}}$ with time.


† $\bar{\bar{H}}$ is the $\bar{\eta}$ of Gibbs, Elementary Principles in Statistical Mechanics, Yale
University Press, 1902, except for added emphasis on the necessity of using the coarse-grained
density P rather than the fine-grained density ρ in order to obtain a suitable quantity.





As the first of these lemmas we have the fact that the quantity
$\int \cdots \int \rho \log \rho\, dq_1 \ldots dp_f$, defined for an ensemble in terms of the
fine-grained density ρ, must actually have a value which remains constant
independent of the time. The validity of this lemma depends on
Liouville’s theorem, in the form of the principle of the conservation of
density in phase (19.11), which gives us

$$\frac{d\rho}{dt} = 0 \qquad (51.11)$$

for the fine-grained density in the neighbourhood of any moving phase
point. This lets us write

$$\frac{d}{dt} \int \cdots \int \rho \log \rho\, dq_1 \ldots dp_f = 0$$

for the rate of change with time of the whole integral, since this is
taken so as to include all phase points. We hence obtain an equality,

$$\int \cdots \int \rho_1 \log \rho_1\, dq_1 \ldots dp_f = \int \cdots \int \rho_2 \log \rho_2\, dq_1 \ldots dp_f, \qquad (51.12)$$

between the values of the integral at any two successive times $t_1$ and
$t_2$. The result given by (51.12), however, does not, of course, necessitate
a constant value for the integral of primary interest,

$$\int \cdots \int P \log P\, dq_1 \ldots dp_f,$$

since the two quantities ρ and P will not in general be equal.

As the second lemma to be presented we have the fact that the two
quantities ρ and P can be combined in such a way as to give a quantity
Q which must itself be essentially positive in character without
reference to the values of ρ and of P. This lemma has the form

$$Q = \rho \log \rho - \rho \log P - \rho + P \geq 0, \qquad (51.13)$$

and holds for any pair of quantities ρ and P which cannot themselves
assume negative values, as is true in the present case on account of
the interpretation of ρ and P as probabilities. To see the validity
of (51.13) we note for any fixed value of P that we shall have

$$Q = 0 \quad\text{when}\quad \rho = P$$

and

$$\frac{dQ}{d\rho} = \log \rho - \log P \;\; \begin{cases} > 0 & \text{when } \rho > P \\ < 0 & \text{when } \rho < P. \end{cases}$$

Hence the equality sign in (51.13) will hold only when ρ equals P, and
the quantity Q will become increasingly positive as ρ becomes either
greater or less than P.
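
The second lemma is easily checked numerically. The following sketch
evaluates Q on an arbitrary grid of non-negative values; the grid and the
function name `q_value` are choices made here for illustration, and the
value at ρ = 0 is taken as the limit of the expression.

```python
import math


def q_value(rho, P):
    """Q = rho log rho - rho log P - rho + P, for rho >= 0 and P > 0."""
    if rho == 0.0:
        return P          # limit of the expression as rho -> 0
    return rho * math.log(rho) - rho * math.log(P) - rho + P


grid = [0.0, 0.01, 0.1, 0.5, 1.0, 2.0, 10.0, 100.0]
positive_P = [p for p in grid if p > 0.0]

smallest = min(q_value(r, p) for r in grid for p in positive_P)
on_diagonal = max(abs(q_value(p, p)) for p in positive_P)
off_diagonal = min(q_value(r, p) for r in grid for p in positive_P if r != p)

print(smallest, on_diagonal, off_diagonal)
```

On this grid Q vanishes only where the two arguments coincide and is
positive everywhere else, in agreement with (51.13).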


(c) Change in $\bar{\bar{H}}$ with time. We are now ready to study the change in
$\bar{\bar{H}}$ with time for an ensemble representing some actual system whose
temporal behaviour we wish to predict. For this purpose let us
consider that we make an appropriate approximate observation on the
state of the system at some initial time $t_1$, and then set up an ensemble
to represent the initial condition that we have found. In accordance
with the methods of statistical mechanics and our fundamental
postulate as to equal a priori probabilities, this initial state of the ensemble
can be obtained by taking uniform distributions of phase points
throughout those regions in the phase space that do correspond to our partial
knowledge of the state of the actual system. These regions will have
small but, nevertheless, finite extensions $\delta q_1 \ldots \delta p_f$ in the phase space
on account of the limited accuracy of our observation. Since the
fine-grained density ρ will have a constant value inside each such region,
it will also be equal to the coarse-grained density P for that same
region. Hence, at the initial time $t_1$, we shall have

$$\rho_1 = P_1 \qquad (51.14)$$

at all points in the phase space. And, in accordance with our definition
(51.8), we can then write for the initial value of the quantity $\bar{\bar{H}}$ that
interests us

$$\bar{\bar{H}}_1 = \int \cdots \int \rho_1 \log \rho_1\, dq_1 \ldots dp_f \qquad (51.15)$$

owing to the equality of ρ and P at the initial time.

In the special case, in which the initial condition of the system is found
to be such as to be represented by a uniform distribution throughout
the whole of each shell in the phase space that lies within any specified
energy range E to E+δE, for example by a canonical or
microcanonical ensemble, the distribution in the phase space will remain
unaltered as time proceeds (see § 21), and ρ and P will continue to
remain permanently equal to each other at all points in the phase space.
This is the case when the system is already in its equilibrium condition
at the initial time $t_1$, and its properties may then be investigated by
the methods previously employed in Chapter IV. In the general case
that now interests us, however, the initial condition of the system will
not be such as to correspond to a uniform density ρ throughout the
whole of each shell E to E+δE, and the distribution of the ensemble
will change as time proceeds. Under these circumstances we can then
no longer expect the initial equality between the two densities ρ and P
at time $t_1$ to persist at all points in the phase space as we go to later
times.





To investigate this we note, in accordance with Liouville’s theorem,
in the form of the conservation of extension in phase (19.15), that the
phase points initially present with the density ρ in any one of the small
regions $\delta q_1 \ldots \delta p_f$ into which we have regarded the phase space as
divided, will in the course of time continue to occupy a total extension
of unchanged magnitude $\delta q_1 \ldots \delta p_f$ provided we fix our attention on
the precise boundaries of that extension. Nevertheless, this extension
cannot be expected at later times to retain its original configuration or
‘shape’, since phase points in different parts of the original small but
finite region $\delta q_1 \ldots \delta p_f$ will correspond to systems that carry out quite
different motions. Indeed, as time continues, we must expect the
boundaries of such an extension to assume a very complicated ‘shape’,
so that the phase points contained therein will ultimately be distributed
in ‘filaments’, of the original density ρ, throughout many regions
$\delta q_1 \ldots \delta p_f$ of the kind into which we have divided the whole phase space.
Thus, at any time $t_2$ later than the initial time $t_1$ we must expect to
find regions $\delta q_1 \ldots \delta p_f$ inside of which the fine-grained density ρ will
have a variety of different values, corresponding to the densities of the
original regions that have contributed phase points to the region now
in question, including, of course, the value zero when the original
density at time $t_1$ was zero in certain regions that might otherwise have
made a contribution.

Hence, at the later time $t_2$, we shall in general have points in the
phase space where the fine-grained density ρ and the coarse-grained
density P will not be equal,

$$\rho_2 \neq P_2, \qquad (51.16)$$

since the latter will be a mean of a variety of different values of the
former. And we shall now have to write, at this later time,

$$\bar{\bar{H}}_2 = \int \cdots \int P_2 \log P_2\, dq_1 \ldots dp_f, \qquad (51.17)$$

for the value of the quantity $\bar{\bar{H}}$ that interests us, since we can no longer
substitute ρ in place of P as at the initial time.

We are now ready to compare the values of $\bar{\bar{H}}$ at the initial time
$t_1$ and at the later time $t_2$. Subtracting (51.17) from (51.15), we may
begin by writing

$$\bar{\bar{H}}_1 - \bar{\bar{H}}_2 = \int \cdots \int (\rho_1 \log \rho_1 - P_2 \log P_2)\, dq_1 \ldots dp_f. \qquad (51.18)$$

This equation may be changed, however, so as to give a more informing
expression. In accordance with the first of our previous lemmas as
given by (51.12), we can replace $\rho_1 \log \rho_1$ in the integrand by $\rho_2 \log \rho_2$,





since the integral of ρ log ρ over the whole of the phase space has been
shown to have a value that does not change with time. Furthermore,
in accordance with the possibility of expressing the value of $\bar{\bar{H}}$ in either
of the forms (51.8) or (51.10), we see that we can replace $P_2 \log P_2$ in
the integrand by $\rho_2 \log P_2$. Finally, in accordance with (51.2) and (51.6),
it is evident that we can add the two terms $-\rho_2$ and $+P_2$ to the
integrand, since the effects of the two additions will cancel out on
integration. Making these changes, we can then rewrite (51.18) in the
form

$$\bar{\bar{H}}_1 - \bar{\bar{H}}_2 = \int \cdots \int (\rho_2 \log \rho_2 - \rho_2 \log P_2 - \rho_2 + P_2)\, dq_1 \ldots dp_f. \qquad (51.19)$$

In accordance with the second of our previous lemmas, however, as
given by (51.13), we see that our integrand now consists of a quantity
which will be zero at any point in the phase space, where the
fine-grained and coarse-grained densities $\rho_2$ and $P_2$ are equal, but will be
greater than zero at all those points discussed above, where the two
densities are no longer equal. Hence we now obtain the desired result

$$\bar{\bar{H}}_2 < \bar{\bar{H}}_1, \qquad (51.20)$$

showing that the value of $\bar{\bar{H}}$ for the ensemble will be less at the later
time $t_2$ than at the initial time $t_1$. This result may be called the
generalized H-theorem. It is clear from the method of derivation that
the asymmetry between the earlier and later values $\bar{\bar{H}}_1$ and $\bar{\bar{H}}_2$ is
essentially dependent on the asymmetry between the expressions $\rho_1 = P_1$
and $\rho_2 \neq P_2$, which corresponds to a decrease with time in the definite
character of our information as to the condition of the system of
interest.
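
The content of the generalized H-theorem can be watched at work in
a toy model. The sketch below is an illustration under stated
assumptions, not a treatment of any real mechanical system: the phase
space is a square lattice of 64 × 64 fine cells, the motion is replaced by
the lattice map (x, y) → (2x + y, x + y) mod 64, which merely permutes
the cells and so conserves density and extension as Liouville’s theorem
requires, and the coarse regions are blocks of 8 × 8 cells. All names are
hypothetical.

```python
import math

N = 64   # fine cells per side of a unit square "phase space" (toy model)
B = 8    # fine cells per side of one coarse region


def step(rho):
    """Move every fine cell by the lattice map (x, y) -> (2x + y, x + y) mod N.

    The map is a permutation of the cells, so it conserves both the density
    carried by each cell and the extension occupied by any set of cells.
    """
    new = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            new[(2 * x + y) % N][(x + y) % N] = rho[x][y]
    return new


def coarse_grain(rho):
    """Mean of the fine-grained density over each B x B coarse region."""
    m = N // B
    P = [[0.0] * m for _ in range(m)]
    for x in range(N):
        for y in range(N):
            P[x // B][y // B] += rho[x][y] / (B * B)
    return P


def h_integral(density, cell_extension):
    """Sum of d * log(d) * extension over cells, empty cells contributing 0."""
    return sum(d * math.log(d) * cell_extension
               for row in density for d in row if d > 0.0)


fine_ext = 1.0 / (N * N)
coarse_ext = (B * B) / (N * N)

# Initial ensemble: uniform over a single coarse region, so rho = P everywhere.
rho = [[0.0] * N for _ in range(N)]
for x in range(B):
    for y in range(B):
        rho[x][y] = 1.0 / (B * B * fine_ext)

fine_history, coarse_history = [], []
for t in range(6):
    fine_history.append(h_integral(rho, fine_ext))
    coarse_history.append(h_integral(coarse_grain(rho), coarse_ext))
    rho = step(rho)

for t, (hf, hc) in enumerate(zip(fine_history, coarse_history)):
    print(t, round(hf, 4), round(hc, 4))
```

The fine-grained column stays fixed, as the first lemma demands, while
the coarse-grained column starts at the same value and falls below it as
soon as the occupied extension is drawn out across several coarse
regions. A map on a finite lattice must eventually return to its starting
arrangement, so the model is only meant to show the early behaviour.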

We thus find that we have indeed defined for our representative
ensemble of systems a quantity

$$\bar{\bar{H}} = \int \cdots \int P \log P\, dq_1 \ldots dp_f, \qquad (51.21)$$

which as time proceeds will change to values lower than that which it
has initially when the ensemble is set up to represent some system of
interest in a known observationally determined condition.
Furthermore, since this change with time depends on a development of
inequalities, at different points in the phase space, between the originally
equal values of ρ and P, we can expect a continued decrease in $\bar{\bar{H}}$ as
time proceeds and more and more such inequalities develop.

Moreover, realizing that the constancy of energy will keep the total
number of phase points in each energy range E to E+δE constant,




and taking the ensemble as in any case only containing members which
have the same values (e.g. zero) as the system of interest itself for such
simple constants of the motion as the components of linear and angular
momentum, it seems reasonable to assume in typical cases that the
changes in distribution considered above would continue until each
small but finite region $\delta q_1 \ldots \delta p_f$ in any energy range E to E+δE would
contain approximately the same proportionate contribution of phase
points from each of the regions in that range which was originally
populated. In drawing this conclusion we regard the restrictions
imposed by less simple constants of the motion as leading to a scattered
exclusion of phase points from precise positions within the various small
but finite regions $\delta q_1 \ldots \delta p_f$ in a manner which does not interest us.
Hence, in typical cases, we may take the distribution of the ensemble
as continuing to change and the value of $\bar{\bar{H}}$ as continuing to decrease,
until we arrive at an ultimate approximately uniform distribution of
coarse-grained probability

$$P = \text{const.} \quad (E \text{ to } E+\delta E) \qquad (51.22)$$

within each energy range E to E+δE which originally contained
representative points.

When this uniform distribution has been reached, further decrease
in $\bar{\bar{H}}$ would no longer be possible, since $\bar{\bar{H}}$ will have then reached its
minimum possible value. This is seen from the consideration that
equation (51.22) is the solution of the condition for a minimum value of $\bar{\bar{H}}$,

$$\delta \bar{\bar{H}} = \int \cdots \int (\log P + 1)\, \delta P\, dq_1 \ldots dp_f = 0, \qquad (51.23)$$

under the necessary subsidiary restriction

$$\int \cdots \int_{E}^{E+\delta E} \delta P\, dq_1 \ldots dp_f = 0 \qquad (51.24)$$

applying in each energy range. On once arriving at such a distribution,
we can expect the approximate uniformity to persist over exceedingly
long intervals of time.
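
The step from the condition for a minimum and its subsidiary restriction
to the uniform distribution can be made explicit with an undetermined
multiplier. The following is a sketch of the standard argument; the
multiplier λ is a symbol introduced here for the purpose and does not
appear in the text above.

```latex
% Multiply the restriction (51.24) by an undetermined constant \lambda and
% add it to the variation (51.23), within a single energy range:
\int \cdots \int_{E}^{E+\delta E} (\log P + 1 + \lambda)\, \delta P \; dq_1 \ldots dp_f = 0 .
% The variation \delta P may now be treated as arbitrary inside the range,
% so the bracket must vanish at every point:
\log P + 1 + \lambda = 0
\quad\Longrightarrow\quad
P = e^{-(1+\lambda)} = \text{const.} \qquad (E \text{ to } E+\delta E).
% The second-order change, \tfrac{1}{2}\int \cdots \int (\delta P)^2 / P \; dq_1 \ldots dp_f > 0,
% shows that the stationary value is a minimum and not a maximum.
```

The value of the constant in each range is then fixed by the total
probability originally assigned to that range.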

In case our original knowledge of the condition of a system of interest
is such as to confine it to a single energy range E to E+δE, we see
that the ultimate condition of its representative ensemble would be an
approximately uniform density of distribution within that energy range.
Hence, in accordance with the methods of statistical mechanics, we
may now conclude that any actual system of interest when left to itself
with a specified value of energy can be expected to arrive in an ultimate
condition of equilibrium, such that its properties can then be correlated,





as in Chapter IV, with the average properties in a corresponding
uniform microcanonical ensemble. This gives us a very satisfactory
justification for our previous treatment of equilibrium.

In case our knowledge of the system of interest is such as to be
represented by a distribution over various energy ranges E to E+δE, we
see that the ultimate condition of the ensemble would be such as still to
preserve the same relative probabilities for different values of the energy.

The above conclusions apply, of course, to the case of perfect
isolation where there is no opportunity for the system of interest to
interchange energy with its surroundings. In a later place, §§ 111 and 112,
we shall discuss the long time behaviour of ensembles representing
systems in contact with their surroundings, and show under
appropriate circumstances that the condition of equilibrium can then be best
represented by a canonical distribution with respect to energy.

(d) Relation between the two forms of H-theorem. We may next
consider the relation of the original form of the H-theorem, which gives
the temporal behaviour of a quantity H defined so as to depend on the
condition of a single system, to the above generalized form of the
H-theorem, which gives the temporal behaviour of a quantity $\bar{\bar{H}}$ defined
so as to depend on the condition of the corresponding representative
ensemble of systems.

For this purpose, let us consider a system composed of n similar
molecules, and let us regard the μ-space, corresponding to the 2r
coordinates and momenta $q_1 \ldots p_r$ for the kind of molecule involved, as
divided into small but finite cells, all having equal extensions

$$\delta v_\mu = \delta q_1 \ldots \delta p_r \qquad (51.25)$$

in the μ-space. The condition of such a system can then be specified
by giving the numbers of molecules $n_1, n_2, n_3, \ldots$, having values of
their coordinates and momenta lying in the different cells i; and the
value of the quantity H, corresponding to this condition, will be given,
in accordance with (47.7), by

$$H = \sum_i n_i \log n_i + \text{const.}, \qquad (51.26)$$

where we take a summation over all cells i in the μ-space.

Let us next consider an ensemble of such systems, and let us regard
the γ-space, corresponding to the 2f = (2r)n coordinates and momenta
$q_1 \ldots p_f$ for the n molecules of a system, as divided into small but finite
regions, all having equal extensions

$$\delta v_\gamma = (\delta q_1 \ldots \delta p_r)_1\, (\delta q_1 \ldots \delta p_r)_2 \ldots (\delta q_1 \ldots \delta p_r)_n \qquad (51.27)$$




in the γ-space, which can be regarded as obtained, in the manner
indicated, by combining cells from the μ-spaces for the n different
molecules. The condition of the ensemble can then be described by
giving the probabilities $P_k\, \delta v_\gamma$ for finding a member of the ensemble in
the different regions k into which we have thus divided the γ-space. And
the quantity $\bar{\bar{H}}$, corresponding to this condition of the ensemble, will
be given in accordance with (51.7) by

$$\bar{\bar{H}} = \sum_k P_k \log P_k\, \delta v_\gamma, \qquad (51.28)$$

where we take a summation over all regions k in the γ-space.

We now wish to obtain an interrelation between the two different
kinds of quantities H and $\bar{\bar{H}}$ given by (51.26) and (51.28). In
agreement with the considerations of § 27, we note that there would be a
total of

$$G = \frac{n!}{\prod_i n_i!} \qquad (51.29)$$

individual equivalent elementary regions, of extension $\delta v_\gamma$ in the phase
space, between which we do not distinguish when we determine the
condition of a system by the numbers of molecules $n_i$ assigned to the
different cells i of the μ-space. Hence, using the symbol $G_\kappa$ to designate
the number of regions k in the γ-space, which correspond to a particular
such condition indicated by the subscript κ, we may now write as an
expression for the total probability $P_\kappa$ of finding a member of our
ensemble in the condition κ

$$P_\kappa = G_\kappa P_k\, \delta v_\gamma, \qquad (51.30)$$

where $P_k\, \delta v_\gamma$ is the probability of finding a member in each elementary
region k that corresponds to the condition κ. Solving (51.30) for $P_k$ and
substituting in (51.28), we then see that we can write our expression
for $\bar{\bar{H}}$ in the form

$$\bar{\bar{H}} = \sum_\kappa G_\kappa \frac{P_\kappa}{G_\kappa\, \delta v_\gamma} \log \frac{P_\kappa}{G_\kappa\, \delta v_\gamma}\, \delta v_\gamma = \sum_\kappa P_\kappa \log \frac{P_\kappa}{G_\kappa\, \delta v_\gamma}, \qquad (51.31)$$

where we now have a summation over the different kinds of condition
κ for a single system.

This expression will now give the desired interrelation between
the two kinds of quantities H and $\bar{\bar{H}}$. For this purpose we note that the
expressions for H and G, given by (51.26) and (51.29), together with




the use of Stirling’s approximation for factorials, will make it possible
for any particular condition κ to substitute

$$\log \frac{1}{G_\kappa\, \delta v_\gamma} = H_\kappa + \text{const.}, \qquad (51.32)$$

where $H_\kappa$ is the value of our original quantity H for a single system
in the condition κ. Hence (51.31) may now be rewritten in the form

$$\bar{\bar{H}} = \sum_\kappa P_\kappa H_\kappa + \sum_\kappa P_\kappa (\log P_\kappa + \text{const.}) = \bar{\bar{H}}_{\text{sys.}} + \sum_\kappa P_\kappa \log P_\kappa + \text{const.}, \qquad (51.33)$$

where we use the symbol $\bar{\bar{H}}_{\text{sys.}}$ to denote the mean value, for all the
members of the ensemble, of the quantity H applying in the original
form of the H-theorem to a single system. This expression now makes
it easy to obtain a clear idea as to the relation between the new and
old forms of H-theorem.
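
The use of Stirling’s approximation in passing from the number of
equivalent regions to the single-system quantity H can be checked with
numbers. In the sketch below the occupation numbers are invented, the
function names are hypothetical, and the exact factorials are handled
through the log-gamma function to avoid overflow.

```python
import math


def log_g_exact(occupations):
    """log of n! / prod(n_i!), computed with log-gamma."""
    n = sum(occupations)
    return math.lgamma(n + 1) - sum(math.lgamma(k + 1) for k in occupations)


def log_g_stirling(occupations):
    """Leading Stirling estimate: n log n - sum of n_i log n_i."""
    n = sum(occupations)
    return n * math.log(n) - sum(k * math.log(k) for k in occupations if k > 0)


occupations = [500, 300, 200]     # hypothetical numbers of molecules per cell
n_total = sum(occupations)

exact = log_g_exact(occupations)
approx = log_g_stirling(occupations)
# Variable part of the single-system H, the sum of n_i log n_i.
h_system = sum(k * math.log(k) for k in occupations if k > 0)
relative_error = abs(exact - approx) / exact

print(round(exact, 2), round(approx, 2), round(relative_error, 4))
```

Within the leading approximation, minus the logarithm of the number of
equivalent regions equals the sum of n_i log n_i less the constant
n log n, which is the content of the substitution made for each
condition; the neglected terms are only logarithmic in the occupation
numbers.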


In the first place, we see that (51.33) expresses the present quantity
$\bar{\bar{H}}$ by which we characterize the condition of the representative
ensemble for a system of interest as the sum of two terms, the first of
which, $\bar{\bar{H}}_{\text{sys.}}$, is the mean value for the members of the ensemble of the
previous quantity H by which we directly characterized the system
itself, and the second of which, $\sum_\kappa P_\kappa \log P_\kappa$, describes the distribution
of the members of the ensemble over different conditions κ. For the
special case $P_\kappa = 1$ corresponding to an ensemble which has just been
set up to represent a system which has been observed to be in a
particular condition κ, we note that the above connexion would reduce to
the simple form

$$\bar{\bar{H}} = H + \text{const.}, \qquad (51.34)$$

where $\bar{\bar{H}}$ applies to the ensemble and H applies directly to the system
of interest.

In the second place, differentiating (51.33) with respect to the time
and making use of the circumstance that the total probability for all
conditions κ has to remain constant, we can write

$$\frac{d\bar{\bar{H}}}{dt} = \frac{d\bar{\bar{H}}_{\text{sys.}}}{dt} + \sum_\kappa \log P_\kappa\, \frac{dP_\kappa}{dt}. \qquad (51.35)$$

We then see that the rate of decrease in $\bar{\bar{H}}$ with time, which we now
take as characterizing the behaviour of the representative ensemble, is
the sum of two terms, the first of which expresses a decrease with time
in the mean value for the members of the ensemble of the quantity H




which we previously took as probably decreasing with time for the
system of interest itself, and the second of which expresses a change
in the direction of more uniform occupation of the different possible
conditions κ. Since it will be seen from (51.33) that a minimum for $\bar{\bar{H}}$
would require the distribution

$$P_\kappa = \text{const.}\; e^{-H_\kappa} \qquad (51.36)$$

in each energy range E to E+δE, it will be appreciated that the
generalized theorem really provides the more direct understanding of the
tendency for a system to proceed towards conditions κ where a low
rather than the absolutely smallest possible value of H may be expected.
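
That a distribution proportional to the exponential of minus H makes the
variable part of the ensemble quantity a minimum can be confirmed
numerically. The sketch below assumes three conditions with invented
values of H; the names `functional`, `P_star` and `trials` are
hypothetical.

```python
import math


def functional(P, H):
    """Variable part of (51.33): sum of P*H plus sum of P*log(P)."""
    return sum(p * h + (p * math.log(p) if p > 0.0 else 0.0)
               for p, h in zip(P, H))


H = [1.0, 2.0, 3.5]                      # hypothetical values of H per condition
weights = [math.exp(-h) for h in H]
Z = sum(weights)
P_star = [w / Z for w in weights]        # distribution of the form const * exp(-H)

best = functional(P_star, H)

# Neighbouring distributions with the same total probability.
eps = 0.01
trials = [
    [P_star[0] + eps, P_star[1] - eps, P_star[2]],
    [P_star[0] - eps, P_star[1], P_star[2] + eps],
    [P_star[0], P_star[1] + eps, P_star[2] - eps],
]
worse = [functional(P, H) for P in trials]

print(round(best, 6), [round(w, 6) for w in worse])
```

Every neighbouring distribution gives a larger value, and the minimum
itself equals minus the logarithm of the normalizing sum. The condition
of lowest H is the most probable single condition, but the others retain
a finite share of the ensemble.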

(e) Concluding remarks on the generalized ff-theorem. We may now 
bring this long section to a close with a few further remarks as to the 
character and validity of the generalized f?-theorem. 

In the first place, it will be of interest to make some qualitative
remarks as to the rapidity with which we can expect the decrease in
$\bar{\bar{H}}$ with time to commence, when we set up an ensemble to represent
a system in a known initial condition. In accordance with the discussion
of § 51(c), we see that this decrease results from the failure of
representative points of the ensemble, originally present in any particular
small but finite region $\delta q_1 \ldots \delta p_f$ with the uniform density ρ, to
move together as a compact whole without change in ‘shape’ of the
extension which they occupy. We see, however, that we can usually
expect such changes in ‘shape’ to start occurring at once since the
precise behaviours of representative points in different parts of the
finite region $\delta q_1 \ldots \delta p_f$ will be very different. For example, in the case
of molecular collisions, since the precise states of the different members
of the ensemble can correspond to widely different spatial positions
for molecules within finite volumes δx δy δz, it is evident that such
differences in behaviour will occur as soon as collisions take place. We
also see from such considerations that we can usually expect a continuance
of such differences in behaviour and a continued decrease in
$\bar{\bar{H}}$ as further collisions occur.

In the second place, it will be of interest to make some further 
remarks as to the validity of our conclusion that the final condition of 
our representative ensemble would be given in typical cases by an 
approximate uniformity of the coarse-grained density P throughout 
any energy range E to E+δE. Here it is evident that the arguments
leading to such a conclusion can only be of a qualitative and plausible 
character, unless we are willing and able to undertake the more precise
kind of mechanical investigation which it is the function of statistical
mechanics to try to avoid. In this connexion the analogy of
Gibbs† as to the effect of stirring on a mixture of water and ink is
appropriate.

If we consider a vessel containing portions of water and non-diffusible
inks of different degrees of blackness, originally in an unmixed condition,
it is evident that nearly any kind of stirring can be expected to
result in an ultimate uniform grey, when looked at from a rough-grained
point of view that neglects the fine-grained densities of the individual
filaments of water and ink that have been drawn out by the stirring.
The analogy is sufficiently good to provide sound reasons for believing,
in the case of the ensembles discussed, that the different fine-grained
densities of distribution ρ, within a shell of the phase space corresponding
to an energy range E to E+δE, would ultimately be so mixed
up as to give us an approximately uniform coarse-grained density P
within that range.

Some further remarks with respect to the approach of an ensemble
towards its ultimate condition will also be of interest. It will be noted
that the behaviour of ensembles that we have described is evidently
what we can expect in general, but that singular behaviours of a different
kind might occasionally occur. For example, if we consider the
behaviour of an ensemble over an exceedingly long time, it is evident
that we might encounter a reconcentration of distribution in the phase
space, which would be analogous in the above illustration to an ‘unmixing’
of the water and ink that could occur during an infinite period
of stirring. This we regard as improbable rather than impossible. As
another example, if the possible motions of the system are of an exceedingly
simple type, it is evident that our original determination of condition
might be sufficient to exclude the distribution permanently from
large regions of the phase space of the correct energy, as would be
analogous in the above illustration to the possibility that no amount
of stirring of the kind employed could lead to a really complete mixing.
Nevertheless, in the case of systems composed of many molecules, it
is evident that the highly singular behaviours, which a single system
could carry out if originally started off with a very special arrangement
of molecules, would usually be unimportant owing to the finite size of
the regions $\delta q_1 \ldots \delta p_f$ which we originally populate with representative
points.

† Gibbs, Elementary Principles in Statistical Mechanics, Yale University Press, 1902,
chapter xii.




It is also of interest to inquire into the rapidity with which the final
equilibrium distribution of the ensemble would be approached. It is
evident that quantitative answers to such questions would depend in
a specific manner on the properties of the kind of system under consideration.
It is important to note, nevertheless, if we divide the phase
space into regions $\delta q_1 \ldots \delta p_f$ of a magnitude large enough to correspond
to the kind of observational accuracy usually attainable, that we can
often expect a reasonably uniform distribution of representative points
into different regions of the right energy to take place quite rapidly.
There is no reason to assume, with observational control of ordinary
accuracy, that the intervals of time, necessary for a practically uniform
distribution of coarse-grained density P, would be of the order of the
Poincaré periods necessary for a single system to return to the immediate
neighbourhood of its original state.†

In concluding this chapter on the H-theorem it is evident that we
must now regard the original discovery of that theorem by Boltzmann
as supplemented in a fundamental and important manner by the deeper
and more powerful methods of Gibbs.‡

† Note the contrary opinion expressed by P. and T. Ehrenfest, Encykl. d. Math.
Wiss. IV. 2, ii, Heft 6, p. 61.

‡ For a modern appreciation of the contributions to statistical mechanics of Gibbs,
see the article by Epstein, ‘Critical Appreciation of Gibbs’ Statistical Mechanics’, in
Commentary on the Scientific Writings of J. Willard Gibbs, vol. ii, Yale University Press,
1936.



VII 

THE ELEMENTS OF QUANTUM MECHANICS

A. HISTORICAL REMARKS

52. The necessity for modifying classical ideas 

The development of physical theory from the time of Galileo and
Newton to the end of the nineteenth century took place as a continuous
process validated at each step by an increased understanding and
mastery of the physical world. To be sure, further concepts beyond
those of Newton were added, more powerful mathematical methods
were introduced, and the considerations of mechanics were increasingly
supplemented by those of electrodynamics, thermodynamics, and
chemistry. At no time, however, did it appear obligatory to re-examine
the fundamental ideas as to the nature of space and time and as to
the character of observation and measurement, which were implicit in
Newtonian procedure. Nevertheless, already before the end of the 
nineteenth century, physical theory had encountered two difficulties 
whose later treatments were to involve the re-examination of those 
fundamental ideas. 

The first of these difficulties was that of reconciling the accepted
wave-like propagation of electromagnetic disturbances with the failure
of all attempts (in particular that of Michelson and Morley) to detect
the earth’s motion through an ether suitable for such a propagation.
The solution of this difficulty, at the hands of Einstein, was made
possible by a deeper criticism of the nature of the processes by which
spatial and temporal determinations can be made. The new ideas thus
obtained as to the nature of space-time specifications, as now incorporated
in the special and general theories of relativity, have affected
the whole of physical thought. The scope and character of the effects
in the realm of macroscopic physics, including the gravitational behaviour
of astronomical bodies, are now well understood; and the
character of many of the effects in the realm of microscopic, atomic
physics is also clear.

The other of the two difficulties for nineteenth-century physics was 
that of explaining the failure of electromagnetic energy to distribute 
itself uniformly over all the possible modes of vibration in an enclosure 
containing radiation which has come to thermal equilibrium. A reasonably
satisfactory solution of this problem, and of others which proved
to be associated with it, has only been made possible by a criticism of 



§62 


ENERGY LEVELS 


181 


the very nature of physical observation, with a resulting appreciation 
of the uncontrolled character of the effects that measurement itself 
must produce on systems (particularly microscopic ones) when under
observation. Our present system of quantum mechanics may be re- 
garded as the ultimate outcome of such criticism. The actual process, 
however, by which physical theory was led from classical mechanics 
to quantum mechanics involved several steps which we may first briefly 
describe. 


(a) Discrete energy levels. In the year 1900 a formula was derived
by Planck† which did agree with the observed distribution of radiation
as a function of frequency inside a hollow enclosure which has come
to thermal equilibrium. To obtain the derivation the radiation was
treated as being in interaction with a set of electric oscillators, of all
different frequencies, which were themselves assumed to be capable of
absorbing or emitting energy only in discrete quanta. For each given
frequency ν the magnitude of these definite amounts of energy was
found to be hν, where h is the new physical constant, having the dimensions
of action, to which Planck’s name is now attached.

Thus was introduced into physics the new idea of atomic systems (i.e.
the electric oscillators) having a set of discrete energy levels separated
by differences in energy which are related to the frequency of the
radiation absorbed or emitted in passing from one level to another by
the expression

$E_2 - E_1 = h\nu.$ (52.1)


The further usefulness of this idea of discrete energy levels was soon
made evident by the work of Einstein,‡ Debye,∥ and others†† on the
specific heat of solids. By assigning discrete energy levels, hν apart,
to the different modes of vibration of a crystalline solid, it was shown
possible to give a good account of the actual energy content of crystals
as a function of temperature, including in particular the low temperature
range where the content is much less than would be predicted in
accordance with the classical law of Dulong and Petit.
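
The departure from the law of Dulong and Petit at low temperatures can be made concrete with a short numerical sketch. It assumes the simplest form of the idea, in which every mode of vibration has the same frequency ν (the single-frequency model usually associated with Einstein); the closed formula used for the heat capacity and the value of 300 K chosen for hν/k are assumptions introduced here for illustration and are not taken from the text.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol K)
DULONG_PETIT = 3.0 * R  # classical molar heat capacity, independent of temperature


def einstein_heat_capacity(temperature, einstein_temperature):
    """Molar heat capacity when all 3N modes have energy levels h*nu apart.

    einstein_temperature stands for h*nu/k; it is an illustrative parameter.
    """
    x = einstein_temperature / temperature
    # C = 3R x^2 e^x / (e^x - 1)^2; expm1 keeps the small-x limit accurate.
    return 3.0 * R * x * x * math.exp(x) / math.expm1(x) ** 2


if __name__ == "__main__":
    for T in (30.0, 100.0, 300.0, 3000.0):
        ratio = einstein_heat_capacity(T, 300.0) / DULONG_PETIT
        print(f"T = {T:7.1f} K   C / 3R = {ratio:.4f}")
```

On these assumptions the ratio C/3R approaches unity at high temperatures and falls far below it when kT is small compared with hν, which is the behaviour described above.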

The theoretical fertility of the idea of discrete energy levels was perhaps
most importantly extended, however, by Bohr’s‡‡ work of 1913
on the hydrogen spectrum. The atom of hydrogen was treated, on the 
basis of the nuclear model of Rutherford, as consisting of a single 


† Planck, Verh. deutsch. phys. Ges. 2, 202, 237 (1900).

‡ Einstein, Ann. der Phys. 22, 180 (1907).

∥ Debye, Ann. der Phys. 39, 789 (1912).

†† Born and Kármán, Phys. Zeits. 13, 297 (1912); 14, 15 (1913).
‡‡ Bohr, Phil. Mag. 26, 1, 476, 857 (1913).




electron rotating around a central proton with centrifugal force balanced 
by the electric attraction, and a simple rule was given for selecting 
discrete quantized orbits from the manifold of possible classical orbits. 
The energies of these quantized orbits then provided a set of discrete 
energy levels, whose use in connexion with equation (52.1) would 
account for the actual frequencies of the many lines of the spectrum 
of monatomic hydrogen. And natural extensions of this procedure, 
with the help of somewhat generalized rules for the selection of quantized
states for any kind of periodic motion,† soon brought nearly the
whole of the complicated field of spectroscopic facts into a considerable 
measure of order. 
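
The use of equation (52.1) with a set of discrete levels may be sketched numerically. The level formula E_n = -Ry/n², with Ry taken as about 13.6 electron-volts, is assumed here as the familiar result of Bohr's rule; it is not written out in the text, and the function names are hypothetical.

```python
H = 6.62607015e-34             # Planck constant, J s
C = 299792458.0                # speed of light, m/s
EV = 1.602176634e-19           # joules per electron-volt
RYDBERG_ENERGY_EV = 13.605693  # assumed energy scale of the hydrogen levels


def level_energy(n):
    """Energy in joules of the n-th quantized orbit, assumed form -Ry/n^2."""
    return -RYDBERG_ENERGY_EV * EV / n ** 2


def line_frequency(n_upper, n_lower):
    """Frequency nu obtained from E_upper - E_lower = h * nu, equation (52.1)."""
    return (level_energy(n_upper) - level_energy(n_lower)) / H


def line_wavelength_nm(n_upper, n_lower):
    """Corresponding wave-length c/nu, expressed in nanometres."""
    return C / line_frequency(n_upper, n_lower) * 1.0e9


if __name__ == "__main__":
    for upper in (3, 4, 5):
        print(f"{upper} -> 2 : {line_wavelength_nm(upper, 2):.1f} nm")
```

With these assumed values the transition from the third to the second level falls near 656 nm, in the red part of the visible spectrum of hydrogen.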

Furthermore, on the experimental side the reality of atomic energy 
levels was clearly demonstrated in 1914 by the experiments of Franck 
and Hertz,‡ in which gaseous atoms could be bombarded by electrons
of known and controllable velocity. When the kinetic energy of the 
electrons was kept less than that necessary to raise the bombarded 
atoms from their normal state to that of the next highest energy level, 
only elastic collisions were found to occur, with no appreciable loss of 
energy on the part of the electrons as shown by their continued ability 
to pass through an appropriate opposing electric field. As soon, how- 
ever, as the energy of the electrons was increased to that necessary for 
raising the atom to the higher level, inelastic collisions set in and the 
energy absorbed by the atoms was found to be re-emitted in the form 
of radiation of the expected frequency. 

This new idea, that atoms are characterized by sets of discrete 
energy levels so that radiation can be absorbed and emitted in definite 
quanta, is the feature of the new developments which has led to the 
name quantum mechanics. Its introduction marks a considerable step
away from classical ideas since there was nothing in the classical picture 
of an electric oscillator or of a planetary atom which would lead us to 
expect that unique properties should be assigned to any particular 
energy levels chosen out of all the possible ones. As we shall see, how- 
ever, it is not the feature of the new developments which is really most 
characteristic or significant for the new mechanics. 

(b) Wave-particle duality. As a consequence of the idea that atoms 
absorb and emit radiation in definite quanta, the further idea was 
introduced in 1905 by Einstein∥ that radiation itself (even in spite of

† Sommerfeld, Ann. der Phys. 51, 1 (1916); Wilson, Phil. Mag. 29, 796 (1916);
Ishiwara, Tokyo Math. Phys. Proc. 8, 106 (1916).

‡ Franck and Hertz, Verh. deutsch. phys. Ges. 16, 467, 612 (1914).

∥ Einstein, Ann. der Phys. 17, 132 (1905); 20, 199 (1906).

the successes of the wave theory) might be regarded at least from
some points of view as having a corpuscular character. The energy of
each corpuscle or photon associated with radiation of frequency ν would
be given by

$E = h\nu$ (52.2)


in agreement with our previous relation between energy levels and fre- 
quency; and the momentum of the photon in the direction of propaga- 
tion would be given by 


$p = \dfrac{h\nu}{c} = \dfrac{h}{\lambda}$ (52.3)


in agreement with the relativistic relations between mass, energy, and 
momentum, where the wave-length λ = c/ν is substituted in the second
form of writing. 
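
Relations (52.2) and (52.3) are easily checked numerically. The following lines are a minimal sketch; the function names and the sample frequency are illustrative assumptions.

```python
H = 6.62607015e-34  # Planck constant, J s
C = 299792458.0     # speed of light, m/s


def photon_energy(frequency):
    """E = h * nu, equation (52.2)."""
    return H * frequency


def photon_momentum(frequency):
    """p = h * nu / c, the first form of equation (52.3)."""
    return H * frequency / C


def wavelength(frequency):
    """lambda = c / nu."""
    return C / frequency


if __name__ == "__main__":
    nu = 5.0e14  # an illustrative frequency in the visible range, Hz
    print("energy   (J)      :", photon_energy(nu))
    print("momentum (kg m/s) :", photon_momentum(nu))
    print("h / lambda        :", H / wavelength(nu))
```

The last two printed values agree, which is the content of the second form of (52.3); the ratio of energy to momentum is c, as the relativistic relations require for a corpuscle moving with the velocity of light.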

This idea of the corpuscular character of radiation was immediately 
useful in explaining the photo-electric emission of electrons from an 
irradiated metal, since these are found to come off with energies which 
do not depend on the intensity of illumination but do increase with the 
frequency of the radiation in the way to be expected if due to the action 
of corpuscles of energy of amount hν. And Einstein† was also able to
derive the Planck law for the distribution of radiation at thermal 
equilibrium by studying the steady state which would be reached if 
atoms can absorb and emit such photons, introducing appropriate 
expressions for the probabilities of these elementary processes. 

Further support for the corpuscular character was furnished by the
discovery of Compton‡ that the frequency of X-radiation scattered by
free electrons was reduced by the amount to be expected as a result
of elastic encounters of photons with electrons, while the later work of
Bothe and Geiger∥ and of Compton and Simon†† gave direct evidence
that the scattered photon and electron do indeed come off at the same
time and in the directions to be expected if energy and momentum are
conserved in the collision. The justification for assigning a dual wave-like
and particle-like character to radiation thus became evident.

The idea of such duality was then extended also to matter by L. de
Broglie‡‡ in 1924. As the inverse of Einstein’s association of particles
with the well-known waves of electromagnetic radiation, it was now
proposed to associate some kind of waves with the well-known particles 

† Einstein, Verh. deutsch. phys. Ges. 18, 318 (1916).

‡ Compton, Phys. Rev. 21, 483 (1923).

∥ Bothe and Geiger, Zeits. f. Phys. 32, 639 (1926).
†† Compton and Simon, Phys. Rev. 26, 289 (1926).
‡‡ L. de Broglie, Phil. Mag. 47, 446 (1924).





of matter. This can be done by assigning to a particle of energy E and 
momentum p waves of the frequency ν and wave-length λ given by
the previous equations (52.2) and (52.3) for the case of the photon. 
Using relativistic expressions for the energy and momentum of the 
particle, and the usual dependence of phase velocity and group velocity 
on frequency and wave-length, this is then found to provide waves 
whose phase velocity is greater than that of light but whose group 
velocity is that of the particle itself, in agreement with the idea, as we 
shall see more clearly later, that the motion of a particle can be related 
to the behaviour of a wave packet. 
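
The statement about the two velocities can be verified by a small computation. The sketch below assumes the relativistic relation E² = (pc)² + (mc²)², takes the phase velocity as νλ = E/p, and estimates the group velocity dν/d(1/λ) = dE/dp by a numerical difference; the function names and the choice of an electron moving at 0.6c are illustrative.

```python
import math

C = 299792458.0  # speed of light, m/s


def energy(p, m):
    """Relativistic energy of a particle of rest mass m and momentum p."""
    return math.sqrt((p * C) ** 2 + (m * C ** 2) ** 2)


def phase_velocity(p, m):
    """nu * lambda = (E / h) * (h / p) = E / p."""
    return energy(p, m) / p


def group_velocity(p, m, rel_step=1e-6):
    """d nu / d(1/lambda) = dE/dp, estimated by a central difference."""
    dp = p * rel_step
    return (energy(p + dp, m) - energy(p - dp, m)) / (2.0 * dp)


def particle_velocity(p, m):
    """v = p c^2 / E for a relativistic particle."""
    return p * C ** 2 / energy(p, m)


if __name__ == "__main__":
    m = 9.1093837015e-31          # electron rest mass, kg
    v = 0.6 * C
    p = m * v / math.sqrt(1.0 - 0.36)
    print("phase velocity / c :", phase_velocity(p, m) / C)
    print("group velocity / c :", group_velocity(p, m) / C)
    print("particle v / c     :", particle_velocity(p, m) / C)
```

The phase velocity comes out as c²/v, greater than c, while the group velocity agrees with the velocity of the particle, so that their product is c².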

The importance of this idea of associating some kind of waves with 
material particles was made very evident in 1927 by the electron diffraction
experiments of Davisson and Germer.† A nickel crystal was
bombarded with a stream of slow electrons of controlled velocity, and
the angles investigated at which a maximum reflection of electrons
occurred. Regarding the surface of the nickel as a grating with a
spacing determined by the known crystal structure, it was then indeed
found that selective reflection did occur at the velocities and angles
which would be predicted as a result of the interference and reinforcement
of waves with the de Broglie wave-length (see 52.3)

$\lambda = \dfrac{h}{p} = \dfrac{h}{mv},$ (52.4)


where m and v are the mass and velocity of the electron. Many further 
examples of such interference phenomena with other methods and other 
particles are now known. 
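
The order of magnitude involved in equation (52.4) may be illustrated as follows. The sketch assumes slow electrons accelerated through a potential of 54 volts and a grating spacing of 2.15 × 10⁻¹⁰ m for the nickel surface; both figures are commonly quoted for this experiment but are not given in the text, and the simple first-order grating condition d sin θ = λ is likewise an assumption of the sketch.

```python
import math

H = 6.62607015e-34         # Planck constant, J s
M_E = 9.1093837015e-31     # electron mass, kg
E_CHARGE = 1.602176634e-19  # electronic charge, C


def de_broglie_wavelength(accelerating_volts):
    """lambda = h / (m v) for a slow electron, with m v = sqrt(2 m e V)."""
    momentum = math.sqrt(2.0 * M_E * E_CHARGE * accelerating_volts)
    return H / momentum


def first_order_angle_deg(wavelength, spacing):
    """Angle of first-order reinforcement from d sin(theta) = lambda."""
    return math.degrees(math.asin(wavelength / spacing))


if __name__ == "__main__":
    lam = de_broglie_wavelength(54.0)    # assumed accelerating potential
    theta = first_order_angle_deg(lam, 2.15e-10)  # assumed grating spacing
    print(f"wave-length = {lam:.3e} m, first-order angle = {theta:.1f} degrees")
```

On these assumptions the wave-length is comparable with the spacing of the atoms in the crystal, which is why a crystal surface can serve as a grating for such electrons.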

We are thus led, both in the case of radiation and of matter, to some 
kind of wave-particle duality which seems at first sight to involve the 
assignment of contradictory properties to the same entity. As we shall 
later see, one of the main tasks and successes of the quantum mechanics
has been to resolve the apparent conflict between opposing points of 
view as to structure which seemed classically irreconcilable. 

(c) Uncertainty, complementarity, and indetermination. We must
now turn to the fundamental considerations of Heisenberg‡ and of
Bohr∥ as to the conceptual possibilities of observation and measurement
in the case of atomic systems. In classical thought it is tacitly
assumed that the changes that take place in a mechanical system with 
time can be followed by making observations and measurements which 


† Davisson and Germer, Nature, 119, 668 (1927).
‡ Heisenberg, Zeits. f. Phys. 43, 172 (1927).

∥ Bohr, Nature, 121, 580 (1928); Naturwissenschaften, 16, 245 (1928).




do not themselves disturb that behaviour in an uncontrolled manner. 
As soon as we think about the possibilities of observing the behaviour 
of atomic systems, however, we immediately see that effects resulting
from the very act of measurement, which would be negligible in the 
case of the macroscopic systems of classical mechanics, may now become 
very important. 

It is characteristic of the new kind of considerations that they incor- 
porate the two types of phenomena, described by particle language and 
by wave language, as will be strikingly illustrated by the example of 
the γ-ray microscope which we are about to discuss, where we shall
have to use both particle concepts and wave concepts in different parts 
of the discussion, in order to give a complete treatment of the situation. 
The consistency of such dual descriptions is ultimately saved by a care- 
ful limitation of the extent to which the two modes of description and 
the regularities to which they correspond do in fact preclude each 
other; in this connexion a recognition of the disturbances produced by 
observation becomes essential (cf. remarks at the beginning of § 61).

The nature of the disturbances produced by observation may be
illustrated by Heisenberg’s treatment of the effects that would result
from a measurement of the position of a free electron, let us say its
positional coordinate $q_x = x$ along the x-axis. To carry out such a
measurement, we may think of the electron as illuminated with radiation
of short enough wave-length to give an accurate determination of
position, say γ-rays, and as then observed in a ‘γ-ray microscope’,
pointed at right angles to the x-axis, and having a large enough aperture
to give the desired resolution. It is, of course, evident that the scattering
of the γ-rays by Compton collisions with the electron will produce
a change in the momentum of the electron, and to make this as small
as possible we may allow a single photon to be scattered into the
microscope and recorded on a sensitive plate. We immediately perceive
that the situation presents a competition between the accuracy with
which we can observe the position of the electron and the uncontrollable
change in the corresponding momentum which will be
introduced by the Compton collision. On the one hand, treating the
γ-rays as waves obeying the principles of optics, we see that the
accuracy of the determination of position will be improved the higher
we make the frequency of the radiation and the larger we take the
aperture and hence resolving power of the microscope. On the other
hand, treating the γ-rays as photons and regarding only a single photon
as scattered, we see that the extent of the uncontrolled change in the
momentum of the electron, as a consequence of Compton scattering,
will become greater the higher the frequency and hence momentum
of the photon, and the greater the aperture and hence uncertainty as
to the direction of scattering and consequent fraction of the momentum
transferred to the electron in a direction at right angles to the microscope.

Calling $\Delta q_x$ the uncertainty in our knowledge of the position of the
electron and $\Delta p_x$ the uncertainty introduced by the collision into
the corresponding component of momentum, simple calculation shows
that the product of the two uncertainties will at least be of the order
of Planck’s constant h. Since similar relations will evidently hold for
the y and z directions, we may now write

$\Delta p_x\,\Delta q_x \approx h,$

$\Delta p_y\,\Delta q_y \approx h,$ (52.5)

$\Delta p_z\,\Delta q_z \approx h,$

as general expressions for the products in question.
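
The ‘simple calculation’ for the γ-ray microscope may be sketched as follows. The two estimates used, a resolving power of about λ/sin ε for a microscope of half-angle of aperture ε, and a sideways recoil of about (h/λ) sin ε, are the customary order-of-magnitude expressions and are assumed here, since the text does not write them out; numerical factors of order unity are ignored.

```python
import math

H = 6.62607015e-34  # Planck constant, J s


def position_uncertainty(wavelength, half_angle):
    """Optical resolving power of the microscope, about lambda / sin(epsilon)."""
    return wavelength / math.sin(half_angle)


def momentum_uncertainty(wavelength, half_angle):
    """Unknown sideways recoil from one photon, about (h / lambda) sin(epsilon)."""
    return (H / wavelength) * math.sin(half_angle)


def uncertainty_product(wavelength, half_angle):
    """Product of the two uncertainties for given wave-length and aperture."""
    return position_uncertainty(wavelength, half_angle) * momentum_uncertainty(
        wavelength, half_angle
    )


if __name__ == "__main__":
    for lam in (1.0e-10, 1.0e-12):       # illustrative wave-lengths, m
        for eps in (0.1, 0.5, 1.0):      # illustrative half-angles, radians
            print(lam, eps, uncertainty_product(lam, eps) / H)
```

Shortening the wave-length or widening the aperture improves the determination of position and worsens that of momentum in exactly compensating ratio, so that on these estimates the product remains of the order of h whatever choice is made.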

It is to be noted that the uncertainties introduced into the com- 
ponents of momenta by the type of collision considered above would 
destroy any previous more exact knowledge as to momentum which 
we might have obtained, for example by examining the Doppler effect 
in radiation reflected by the moving electron. Hence the above equa- 
tions prescribe limits to the knowledge of the simultaneous values of 
the variables p and q which can be obtained by this method of measurement,
and it is found that these limits cannot be lessened by other
methods of measurement, for example by observing the position first 
and the momentum afterwards. It should be noted that the limitation 
applies only to the conjugated pairs of variables consisting of a co- 
ordinate and its corresponding momentum, since the same difficulties 
would not be encountered in the simultaneous determination of two 
coordinates, or of one coordinate and of a component of momentum at 
right angles to the coordinate axis. Later, in § 62, we shall also find
that the conjugated quantities energy and time are connected by a 
similar restrictive relation 

$\Delta E\,\Delta t \approx h,$ (52.6)

having a physical content which will be investigated. The above equations
(52.5) and (52.6) are particular examples of the general relation
known as Heisenberg’s uncertainty principle.

The relationship between two classical variables, whose values are
subject to interconnected uncertainties as in the above, has been called
complementarity by Bohr.† It is to be particularly emphasized in the
case of such complementary variables that increased accuracy in our 
knowledge of one of the variables can be obtained only at the expense 
of decreased accuracy in our knowledge of the other. This may be 
regarded as an illustration of a general type of complementarity in which 
one kind of knowledge or one method of treatment is connected in a 
mutually exclusive fashion with another supplementary kind of know- 
ledge or method of treatment; for example, the complementarity 
between the treatment of radiation from the wave and from the 
particle points of view. 

The most striking consequence of Heisenberg’s uncertainty principle 
is the indeterminacy which it introduces into the possibilities of physical
prediction. In the classical mechanics we grew accustomed to the idea 
that an exact knowledge of the coordinates and momenta of a system 
at a given initial time would then make it possible, with the help of 
the equations of motion, to make an exact prediction as to the future 
behaviour of the system. We now see, however, that such exact know- 
ledge of the initial values of both the coordinates and momenta is not 
possible, and hence must give up our older ideas of the possibility of 
exact prediction and of a complete causal dependence of the later on 
the earlier behaviour of a mechanical system. 

Such a conclusion produces a drastic change in the ideology of science, 
the progress of which has hitherto been dominated by the concepts of 
strict causal determinism and predictability; and, to be sure, any entire
elimination of the ideas of causation and prediction would surely be
fatal to science itself. We note, however, that our foregoing considerations
have not necessarily eliminated the possibility for a sufficient
degree of causal dependence to permit predictions as to the probable
behaviour of mechanical systems from the actually allowable knowledge
of their initial states, even though exact predictions are no longer possible.
Indeed we shall regard as the most characteristic feature of the
quantum mechanics the methods which it does provide for making
predictions as to the expected or average behaviour of systems.

(d) The correspondence principle. These new ideas as to the possible 
functions of the science of mechanics are so different from classical ones
that some anxiety might be felt as to the preservation of an appropriate 
status for classical mechanics in the new scheme of thought, since the 
great achievements of the older mechanics certainly have a wide range 
of valid application. This range of validity, however, is satisfactorily 
† Bohr, Nature, 121, 580 (1928); Naturwissenschaften, 16, 245 (1928).




preserved in the case of macroscopic systems of the kind whose study 
led to the formulation of classical mechanics, since we shall find that 
the probability predictions of the quantum mechanics are then affected 
by such a narrow range of uncertainty as to explain and justify the 
classical treatment, in agreement with our intuitive appreciation of
the fact that disturbing effects connected with methods of observation
must be negligible in the case of macroscopic systems. Furthermore,
the whole development of quantum theory — also when concerned with 
microscopic systems as well — has been carried out in the light of Bohr’s 
correspondence principle which recognizes the fairly general existence 
of limiting conditions where the corresponding classical and quantum 
treatment of a problem lead to converging results. We shall expect 
these limiting conditions to be approached when the degree of uncertainty,
arising from the appearance of the elementary quantum of
action h in the uncertainty relations (52.5) and (52.6), becomes negligible
for the problem in hand. Hence the correspondence principle may be
applied to the conclusions of quantum mechanics by seeing if they
approach the corresponding classical ones as h goes to zero.
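
The negligible role of the quantum of action for macroscopic systems can be seen from a rough estimate based on relation (52.5), taking the uncertainty in velocity as about h/(m Δq). The masses and positional accuracies below are illustrative assumptions and not values from the text.

```python
H = 6.62607015e-34  # Planck constant, J s


def velocity_uncertainty(mass, position_uncertainty):
    """Order-of-magnitude estimate Delta v = h / (m * Delta q) from (52.5)."""
    return H / (mass * position_uncertainty)


if __name__ == "__main__":
    # A body of one gram located to within a thousandth of a millimetre.
    print("macroscopic body :", velocity_uncertainty(1.0e-3, 1.0e-6), "m/s")
    # An electron located to within atomic dimensions.
    print("electron         :", velocity_uncertainty(9.1093837015e-31, 1.0e-10), "m/s")
```

For the macroscopic body the estimate lies far below any attainable accuracy of measurement, whereas for the electron it is comparable with the velocities met with in atoms; the classical treatment is thus recovered in the first case and fails in the second.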

(e) Plan of treatment. We have now completed a meagre account 
of the new ideas which seem most important for the quantum theory; 
and we might hence try to trace the steps by which these new ideas 
could be made to lead to a new theory of mechanics. Such a plan of 
treatment would not be very satisfactory, however, since it is always 
difficult to show that a given set of specific facts and considerations 
are sufficient to give a unique determination of a general theoretical 
structure; and since the actual path of development was, of course, 
affected by historical accidents which no longer seem important. Hence 
we shall actually adopt the preferable plan of a deductive rather than 
an inductive form of exposition, in which we shall lay down a postula- 
tory basis for the quantum mechanics and then show that our postulates 
do imply conclusions in agreement with the new ideas as to energy 
levels, wave-particle duality, indetermination, and correspondence. 

In the next four sections we shall first present and discuss the 
postulatory basis necessary for the new mechanics, which will then be 
summarized in § 57. This will be followed in §§ 58 and 59 by the derivation
of two fundamental theorems which are necessary for the whole
of quantum mechanics, and then in §§ 60 to 63 we shall be able to show 
that our new system does incorporate the changed ideas discussed 
above. The remainder of the chapter will be devoted to the further 
development of quantum mechanical methods, and the following chapter
to a derivation of needed consequences of the theory. We shall then
be ready to discuss the statistical methods which are to be introduced
for the treatment of systems which are not in a sufficiently well defined
state for treatment by the direct methods of quantum mechanics.

The postulatory treatment which we shall give will not pretend to 
complete logical elegance and rigour, since we shall not try to obtain
the smallest possible number of mutually independent and compatible 
postulates, nor to make sure that all the postulates are actually expli- 
citly stated. Furthermore, the postulatory treatment will actually be 
put in terms especially appropriate for application to the non-relativistic
mechanics of particles so important for atomic physics. By a suitable 
generalization — in the sense of the classical Hamiltonian theory of fields 
— of the meaning of the coordinates and momenta in terms of which 
our postulates are formulated, we should obtain a suitable basis for 
a relativistic quantum mechanics of particles and a quantum treatment 
of radiation. In spite of inadequacies it is hoped in any case that the 
exposition will be sufficient to assure a correct feeling for the essential 
nature of quantum theory. 

B. THE POSTULATES

53. The existence of probability densities and amplitudes 

Our first postulate for quantum mechanics gives explicit recognition 
to the idea that the statements and predictions of the new theory must 
for the most part be confined to assertions as to the probability of 
finding particular values for the coordinates and momenta of a mechani- 
cal system at a given time of interest, rather than assertions as to their 
exact values. 

Consider a mechanical system of f degrees of freedom, which would
be described classically by specifying the exact values of its 2f coordinates
and momenta $q_1 \ldots q_f$ and $p_1 \ldots p_f$, and let us now denote by

$W(q_1 \ldots q_f, t)\,dq_1 \ldots dq_f$ and $W(p_1 \ldots p_f, t)\,dp_1 \ldots dp_f$ (53.1)

the probabilities at time t that the system would be found on measurement
to have values for its coordinates or momenta lying in the differential
ranges indicated. The quantities $W(q_1 \ldots q_f, t)$ and $W(p_1 \ldots p_f, t)$
may appropriately be called probability densities. They will evidently
have to be real positive quantities on account of the physical nature
which we have assigned them.

Let us now express the real positive character of these two kinds
of probability density by equating them to the squares of the absolute
magnitudes of certain corresponding quantities

$\psi(q_1 \ldots q_f, t)$ and $\phi(p_1 \ldots p_f, t)$,

writing

$W(q_1 \ldots q_f, t) = \psi^*(q_1 \ldots q_f, t)\,\psi(q_1 \ldots q_f, t)$

and (53.2)

$W(p_1 \ldots p_f, t) = \phi^*(p_1 \ldots p_f, t)\,\phi(p_1 \ldots p_f, t),$

where complex conjugates are denoted by the use of asterisks. We may
give the name probability amplitudes to the newly introduced quantities
ψ and φ for reasons which will appear later (§§ 59, 61). In general they
will actually turn out to be complex quantities as indicated by our
notation.†

Our first fundamental postulate may now be regarded as definitely
asserting the existence of such probability densities $W(q, t)$ and $W(p, t)$
with the physical significance which we have attached to them, and as
carrying the suggestion that an appropriate apparatus for their calculation
will be found by relating them, as we have done in equations (53.2),
to the corresponding probability amplitudes $\psi(q, t)$ and $\phi(p, t)$.

We must now examine some of the explicit or tacit implications of 
the postulatory material which we have thus introduced. We may 
begin by considering the expressions given by (53.1) for the probabilities
of finding values for the coordinates and momenta of a system that lie 
in specified ranges. 

† Two remarks as to notation may be made at this point. For the purpose of physics
it is convenient to denote complex conjugates by an asterisk *, rather than by a bar
as is done in pure mathematics, since the bar can then still be used to denote mean
values, as is customary in physics. In expressing the functional dependence of probability
densities and amplitudes on the coordinates and momenta, it will often be convenient
to denote the collections of quantities $q_1 \dots q_f$ and $p_1 \dots p_f$ by the single symbols q and
p, as is done in the next paragraph.

Here it is to be emphasized first of all that the probability densities
$W(q, t)$ and $W(p, t)$ are to be thought of as quantities whose values could
actually be empirically determined by setting up the same physical
situation over and over again and finding the frequency with which the
coordinates or momenta do fall at the time of interest t in different
ranges $dq_1 \dots dq_f$ and $dp_1 \dots dp_f$. This implies in the first place that we
are going to retain the general classical idea of the existence of coordinates
and momenta whose values can be measured by recognized
classical methods; and implies in the second place that the results can
be regarded as applying at a perfectly definite time t. The first of these
implications means that at least some classical methods of measurement
are going to remain appropriate in the quantum mechanics; and the
second implication means that probabilities can be specified in our
present non-relativistic quantum mechanics without introducing any
range $dt$ in the time similar to our ranges $dq_1 \dots dq_f$ and $dp_1 \dots dp_f$ for
the coordinates and momenta.

The existence of the probability densities $W(q, t)$ and $W(p, t)$ can also
be regarded as implying that there is no theoretical limitation (which
we need now consider) on the accuracy with which all of the coordinates
or all of the momenta of a system could be determined. On
the other hand, the introduction of separate probability densities for
the coordinates and momenta does suggest that the question of limits
of accuracy in the simultaneous determination of coordinates and of
momenta is to be left open for later examination. The future development
of physical theory may also make it necessary to consider the
limits which would be imposed, even in the separate measurement of
coordinates or momenta for themselves, by the necessary atomic
nature of our measuring instruments. But in the present theory we
have to neglect the possible effect of such considerations.†

The fact that the introduction of the probability densities $W(q, t)$
and $W(p, t)$ has implied the general classical idea of coordinates and
momenta as observable quantities should not be interpreted as limiting
the quantum mechanics solely to the treatment of observables which
have a classical analogue. Indeed, when we come to a consideration of
the spin of the electron or of other fundamental particles, we shall be
led to introduce observables which have no classical analogue, and shall 
then make use of probability densities and amplitudes which are func- 
tions of non-classical spin variables as well as of the classical coordinates 
and momenta of the particle. Until we come to that extension, how- 
ever, our statements will not be worded so as to give explicit recognition 
to the possibility of observables having no classical analogue. 

As a necessary consequence of the significance ascribed to the probability
densities, it is evident, since the coordinates and momenta
must have unit chance of falling somewhere, that the following equations
must be true at all times:

$\int \dots \int W(q, t)\,dq_1 \dots dq_f = \int \dots \int \psi^*(q, t)\,\psi(q, t)\,dq_1 \dots dq_f = 1$

and $\int \dots \int W(p, t)\,dp_1 \dots dp_f = \int \dots \int \phi^*(p, t)\,\phi(p, t)\,dp_1 \dots dp_f = 1$, (53.3)

where the integrations are to be taken over the whole range of possible
values for the coordinates or momenta, and where (53.3) implies that
the integrals must converge.

† See Ruark, Proc. Nat. Acad. 14, 322 (1928); and Landau and Peierls, Zeits. f. Phys.
69, 56 (1931), concerning limitations on the measurement of position alone. Limitations
on the possibility of locating an electron within a region of the order of the Compton
wave-length have since been found related to the production of pairs of positive and
negative electrons.
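The normalization condition (53.3) can be checked numerically for a simple case. The following is a minimal sketch, assuming a single coordinate (f = 1) and a Gaussian amplitude with a phase factor; the function `psi`, the parameter `k0`, and the integration limits are illustrative choices that do not come from the text.

```python
import cmath
import math

def psi(q, k0=1.5):
    """Illustrative one-coordinate amplitude (an assumption): Gaussian times a phase factor."""
    return math.pi ** -0.25 * cmath.exp(-0.5 * q * q + 1j * k0 * q)

def integrate(func, lo, hi, n=4000):
    """Trapezoidal rule for a real- or complex-valued function on [lo, hi]."""
    step = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for j in range(1, n):
        total += func(lo + j * step)
    return total * step

def density(q):
    """Probability density W(q) = psi*(q) psi(q); real and non-negative by construction."""
    amp = psi(q)
    return (amp.conjugate() * amp).real

# The total probability should be close to unity, as required by (53.3).
norm = integrate(density, -10.0, 10.0)
print(round(norm, 6))
```

The amplitude itself is complex, while the density formed from it is real and non-negative, which is the point of writing W as the product of the amplitude and its conjugate.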

Turning now to the quantities on the right-hand side of equations
(53.2), the probability amplitudes $\psi$ and $\phi$ will in general actually turn
out to be complex numbers consisting of a real and imaginary part,
and are not themselves measurable but are to be regarded as summarizing
the directly observable properties of the system. For example,
the squares of their absolute magnitudes are real quantities equal to
the probability densities which can be empirically observed. This is in
agreement with the idea that the equations of mathematical physics
are to provide a formalism for computation which leads to results
capable of empirical determination even though the formalism itself
contains symbols which have no direct reference to observable physical
quantities.†

The justification for the introduction of these complex quantities, 
the probability amplitudes, whose squares are to give the observable 
probability densities, will, of course, only appear in the sequel as the 
usefulness and validity of the whole formalism becomes apparent. We 
shall actually find that these probability amplitudes play a primary 
role in the quantum mechanics, since we shall see later that all we can 
know about a system at any time of interest will be determined by 
giving the instantaneous dependence of such a quantity on the variables 
of which it is a function; for example, by giving the dependence of $\psi$
on $q_1 \dots q_f$ at the time of interest t. For this reason we shall speak of
the probability amplitudes for a system (such as $\psi$, $\phi$, or others which
prove possible) as being quantities by which we can specify the
quantum mechanical state of a system.

It may be remarked in passing that the attempt to specify the 
state of a system by using a single real quantity as a probability 
amplitude would not be sufficient to determine the further behaviour 
of the probability field. Just as a dual specification of coordinates 
and velocities is necessary to determine the state of a system 
in the classical mechanics, so, too, a dual specification correspond- 
ing to the two numbers specifying our complex probability ampli- 
tudes is necessary to determine the quantum mechanical state of a 
system. 

† This is contrary to the apparently pleasing but somewhat unfortunate statement
that the equations of mathematical physics should contain only quantities which are 
susceptible of direct measurement. 





54. The interrelation of probability amplitudes

In the preceding section we have introduced both the probability
amplitude $\psi(q_1 \dots q_f, t)$, which determines the chance of finding the
coordinates in a given range, and the analogous probability amplitude
$\phi(p_1 \dots p_f, t)$ for the momenta. These quantities, however, are not
independent but are related to each other, so that either can be calculated
from a knowledge of the other. As our second postulate for the quantum
mechanics, we must now give a statement of the mutual interdependence
of these quantities.

Since the explicit form of the relation connecting the values of $\phi(p, t)$
with those of $\psi(q, t)$ is not entirely independent of the kind of coordinates
and momenta which are being employed, we shall state the relation in
the form for so-called canonical quantum mechanical coordinates and
momenta. In the case of a mechanical system composed of particles,
the ordinary Cartesian coordinates and corresponding momenta of the
individual particles furnish such a canonical set, and, starting with these,
a transformation to other kinds of coordinates and momenta can be
undertaken whenever desirable. Coordinates and momenta, which are
canonical in the quantum mechanical sense, always have the important
property of being physically interpretable over the whole range of
possible values from minus to plus infinity.

Using such canonical coordinates and momenta, the relation of the
probability amplitudes $\phi$ and $\psi$ can now be simply expressed with
the help of a Fourier integral by the equation

$\phi(p_1 \dots p_f, t) = h^{-f/2} \int \dots \int \psi(q_1 \dots q_f, t)\, e^{-(2\pi i/h)(q_1 p_1 + \dots + q_f p_f)}\,dq_1 \dots dq_f$, (54.1)

where h is Planck’s constant, i is the symbol for the imaginary root
$\sqrt{-1}$, and the integrations are to be taken over the total range from
$-\infty$ to $+\infty$ for which the coordinates q have significance.

By the ordinary rules for the inversion of Fourier integrals (see
Appendix II c) this equation can be solved in a form to permit the
calculation of $\psi$ as a function of the q’s for any given distribution
of $\phi$ as a function of the p’s. Furthermore, we can readily write
down the analogous relations for the complex conjugate quantities
$\phi^*$ and $\psi^*$. Introducing some obvious abbreviations, we can then
rewrite equation (54.1), together with the three further equations
which it implies, in the following forms which will be useful for future
reference:

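$\phi(p, t) = h^{-f/2} \int \psi(q, t)\, e^{-(2\pi i/h)(q_1 p_1 + \dots + q_f p_f)}\,dq$,

$\phi^*(p, t) = h^{-f/2} \int \psi^*(q, t)\, e^{+(2\pi i/h)(q_1 p_1 + \dots + q_f p_f)}\,dq$,

$\psi(q, t) = h^{-f/2} \int \phi(p, t)\, e^{+(2\pi i/h)(q_1 p_1 + \dots + q_f p_f)}\,dp$,

$\psi^*(q, t) = h^{-f/2} \int \phi^*(p, t)\, e^{-(2\pi i/h)(q_1 p_1 + \dots + q_f p_f)}\,dp$. (54.2)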

This part of our postulatory basis is of great significance for quantum
mechanics. It shows that the specification at a given time t either of
the probability amplitude $\psi$ as a function of the coordinates q, or of the
probability amplitude $\phi$ as a function of the momenta p, is alone sufficient
to determine both the probability density $\psi^*\psi$ which permits us
to calculate the chances for finding given values of the coordinates, and
the probability density $\phi^*\phi$ which permits us to calculate the chances
for finding given values of the momenta, and thus also the chances for
finding given values of various other dynamical quantities which are
functions of the coordinates and momenta. Indeed, in the quantum
mechanics we regard the instantaneous state of a system as fully given
when we have specified the dependence of a single suitable probability
amplitude (such as $\psi$, $\phi$, or others which prove possible) on the
variables of which it is a function.

This possibility of describing the state of a system in alternative ways,
by using one or another of various possible probability amplitudes is
a very characteristic feature of the quantum mechanics. The different
kinds of probability amplitudes are themselves functions of different
sets of variables, $\psi$ and $\phi$, for example, being functions of the q’s and
p’s respectively. Hence it is sometimes convenient to say that we are
using a particular kind of language, for example the q-language or the
p-language, to describe the state of our system; or we can also say that
we are using a q-representation or a p-representation of that state.
Equations (54.2) give us the necessary apparatus for transforming from
the q-language to the p-language and vice versa, and similar equations
become available for transforming to other kinds of language. Factors
such as the quantities

$e^{-(2\pi i/h)(q_1 p_1 + \dots + q_f p_f)}$ and $e^{+(2\pi i/h)(q_1 p_1 + \dots + q_f p_f)}$

occurring in equations (54.2), which are used in transforming from one
language to another, are known as transformation functions for the sets
of variables involved.
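The translation between the two languages can be tried out numerically. The sketch below assumes a single coordinate (f = 1), units in which h = 2π, and a Gaussian amplitude; the names `psi`, `phi`, `psi_back`, the constant `K0`, and the integration limits are illustrative assumptions rather than anything prescribed by the text. It applies (54.1) to obtain the momentum amplitude, compares it with the closed form known for this Gaussian, and then transforms back.

```python
import cmath
import math

H = 2.0 * math.pi   # assumed units in which h = 2*pi
K0 = 1.5            # hypothetical packet parameter

def psi(q):
    """Illustrative coordinate amplitude (an assumption, not from the text)."""
    return math.pi ** -0.25 * cmath.exp(-0.5 * q * q + 1j * K0 * q)

def integrate(func, lo, hi, n):
    """Trapezoidal rule for a complex-valued function on [lo, hi]."""
    step = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for j in range(1, n):
        total += func(lo + j * step)
    return total * step

def phi(p):
    """Momentum amplitude obtained from psi by (54.1) with f = 1."""
    kernel = lambda q: psi(q) * cmath.exp(-2j * math.pi * q * p / H)
    return H ** -0.5 * integrate(kernel, -12.0, 12.0, 1000)

def psi_back(q):
    """Coordinate amplitude recovered from phi by the inverse relation in (54.2)."""
    kernel = lambda p: phi(p) * cmath.exp(2j * math.pi * q * p / H)
    return H ** -0.5 * integrate(kernel, K0 - 10.0, K0 + 10.0, 400)

# For this Gaussian the transform is known in closed form when h = 2*pi.
p_test = 0.7
closed_form = math.pi ** -0.25 * math.exp(-0.5 * (p_test - K0) ** 2)
err_forward = abs(phi(p_test) - closed_form)
err_back = abs(psi_back(0.4) - psi(0.4))
print(err_forward, err_back)
```

Both printed errors should be very small, which illustrates that either amplitude alone fixes the other.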




Our postulatory basis for transforming between the q-language and
p-language as given by equations (54.2) was set up for the case of
canonical coordinates and momenta whose range of significant values
was assumed to go from minus infinity to plus infinity. If we desire to
use coordinates and momenta for which this is not true, for example
to use polar coordinates and their corresponding momenta, it is best to
set the particular problem up in Cartesian coordinates and carry out
a later translation to the final coordinates and momenta desired. The
consideration of transformations to other than q- and p-representations
and of transformations when non-classical variables are involved will
have to be postponed.

55. The operators corresponding to observable quantities and their use in calculating expectation values

(a) Preliminary discussion. In accordance with the foregoing, the
state of a system at any time of interest can be specified by giving
the instantaneous dependence of a suitable probability amplitude on
the variables of which it is a function. For this purpose we may use
either the probability amplitude $\psi(q_1 \dots q_f, t)$, which is a function of the
f coordinates of the system, or the probability amplitude $\phi(p_1 \dots p_f, t)$,
which is a function of the corresponding f momenta; and making use
of equations (54.2) we can transform the specification of state from one
of these sets of variables to the other; i.e. translate between the q- and
p-languages. We already appreciate that such specifications of state,
with the help of the relation between probability amplitudes and
probability densities given by (53.2), make it possible to calculate the
probabilities of finding different values of the coordinates and momenta
when a system in some specified state of interest is examined. In the
present section we now wish to discuss the possibility of using such
specifications of state to calculate the mean value, or so-called expectation
value, which would be found for any function $F(q, p)$ of the coordinates
and momenta, as the average result of a series of measurements
made on that quantity when the system is in a specified state
of interest.

In the case of quantities which are functions solely of the coordinates
q or solely of the momenta p of the system, the calculations of such
mean values can be carried out without further addition to our postulatory
basis. In the case of a quantity $F(q_1 \dots q_f)$ which depends solely
on the coordinates of the system, the calculation can be made most
simply with the help of a specification of state in the q-language. The
probability of finding the coordinates of the system in any specified
range $dq_1 \dots dq_f$ will then be given, in accordance with (53.2), by
$\psi^*(q, t)\,\psi(q, t)\,dq_1 \dots dq_f$, and the mean value of $F(q)$ at the time t will
hence be given by

$\bar{F}(q, t) = \int \dots \int \psi^*(q, t)\,F(q)\,\psi(q, t)\,dq_1 \dots dq_f$, (55.1)

where the integration is over all possible values of the coordinates
$q_1 \dots q_f$; the reason for choosing the order of writing with $F(q)$
placed between $\psi^*$ and $\psi$ will be seen later. Similarly, in the case of
a quantity $F(p)$, which depends solely on the momenta, the mean value
can be simply expressed in the p-language by the equation

$\bar{F}(p, t) = \int \dots \int \phi^*(p, t)\,F(p)\,\phi(p, t)\,dp_1 \dots dp_f$, (55.2)

where we now take an integration over all possible values of the
momenta.

We thus see that the calculation of the expectation value for a function
$F(q)$ of the coordinates will be straightforward in the q-language,
and the similar calculation for a function $F(p)$ of the momenta will be
straightforward in the p-language. Nevertheless, it will also be possible,
in the case of functions which can be expressed as polynomials in the
q’s or the p’s, to carry out the calculations with the other choice of
language. This we must now investigate, since it will lead us to the
important idea of the operators which correspond to observable quantities
in the quantum mechanics and which make it possible to express
the expectation values for such quantities in any desired quantum
mechanical language.

As a simple illustration, let us investigate the possibility of expressing
the mean value of one of the momenta for the system, say $p_k$, in terms
of the q-language. For this purpose we can start out by using the
p-language and write

$\bar{p}_k = \int \dots \int \phi^*(p, t)\,p_k\,\phi(p, t)\,dp_1 \dots dp_f$ (55.3)

as an evident expression for the mean value of $p_k$ in terms of the
momentum probability amplitudes $\phi^*(p_1 \dots p_f, t)$ and $\phi(p_1 \dots p_f, t)$
which correspond to the state of the system at any time t. This, however,
can be re-expressed in the q-language with the help of our fundamental
postulated relation

$\phi(p_1 \dots p_f, t) = h^{-f/2} \int \dots \int \psi(q_1 \dots q_f, t)\, e^{-(2\pi i/h)(q_1 p_1 + \dots + q_f p_f)}\,dq_1 \dots dq_f$, (55.4)





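which connects the two kinds of probability amplitudes. Substituting
(55.4) in (55.3), we obtain

$\bar{p}_k = h^{-f/2} \int \dots \int \int \dots \int \phi^*(p, t)\,p_k\,\psi(q, t)\, e^{-(2\pi i/h)(q_1 p_1 + \dots + q_f p_f)}\,dq_1 \dots dq_f\,dp_1 \dots dp_f$.

Introducing a differentiation with respect to the coordinate $q_k$ that
corresponds to the momentum of interest $p_k$, this can be rewritten in
the form

$\bar{p}_k = -h^{-f/2} \int \dots \int \int \dots \int \phi^*(p, t)\,\psi(q, t)\,\frac{h}{2\pi i}\frac{\partial}{\partial q_k} e^{-(2\pi i/h)(q_1 p_1 + \dots + q_f p_f)}\,dq_1 \dots dq_f\,dp_1 \dots dp_f$.

And carrying out a partial integration with respect to $q_k$, this then
gives us

$\bar{p}_k = h^{-f/2} \int \dots \int \int \dots \int \phi^*(p, t)\, e^{-(2\pi i/h)(q_1 p_1 + \dots + q_f p_f)}\,\frac{h}{2\pi i}\frac{\partial \psi(q, t)}{\partial q_k}\,dq_1 \dots dq_f\,dp_1 \dots dp_f$, (55.5)

since we take $\psi(q, t)$ as equal to zero at the limits of integration $q_k = \pm\infty$
owing to our present interest in systems where the probability of finding
infinite values of the coordinates is vanishingly small. As our final step
we may now again introduce the connecting relation between the two
languages, this time in the form

$\psi^*(q_1 \dots q_f, t) = h^{-f/2} \int \dots \int \phi^*(p_1 \dots p_f, t)\, e^{-(2\pi i/h)(q_1 p_1 + \dots + q_f p_f)}\,dp_1 \dots dp_f$, (55.6)

which can be obtained from the original form (55.4), as we have already
seen in the preceding section. Substituting (55.6) in (55.5), we then
obtain the desired result

$\bar{p}_k = \int \dots \int \psi^*(q, t)\,\frac{h}{2\pi i}\frac{\partial}{\partial q_k}\psi(q, t)\,dq_1 \dots dq_f$, (55.7)

which provides an expression for the mean value of the momentum $p_k$
in terms of the coordinate probability amplitudes $\psi^*(q, t)$ and $\psi(q, t)$.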
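The agreement between the p-language expression (55.3) and the q-language expression (55.7) can be illustrated numerically. This is a minimal sketch under stated assumptions: one coordinate (f = 1), units with h = 2π, a Gaussian amplitude whose exact mean momentum is `K0` in those units, and a central-difference approximation for the derivative. The helper names and all numerical parameters are hypothetical.

```python
import cmath
import math

H = 2.0 * math.pi   # assumed units in which h = 2*pi
K0 = 1.5            # hypothetical parameter; exact mean momentum of this packet

def psi(q):
    """Illustrative coordinate amplitude (an assumption, not from the text)."""
    return math.pi ** -0.25 * cmath.exp(-0.5 * q * q + 1j * K0 * q)

def integrate(func, lo, hi, n):
    """Trapezoidal rule for a real- or complex-valued function on [lo, hi]."""
    step = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for j in range(1, n):
        total += func(lo + j * step)
    return total * step

def phi(p):
    """Momentum amplitude from psi by (55.4) with f = 1."""
    kernel = lambda q: psi(q) * cmath.exp(-2j * math.pi * q * p / H)
    return H ** -0.5 * integrate(kernel, -12.0, 12.0, 1000)

def p_operator_on_psi(q, eps=1e-4):
    """(h / 2 pi i) d(psi)/dq, with the derivative taken by a central difference."""
    dpsi = (psi(q + eps) - psi(q - eps)) / (2.0 * eps)
    return H / (2j * math.pi) * dpsi

def momentum_weighted_density(p):
    """Integrand of (55.3): phi*(p) p phi(p)."""
    amp = phi(p)
    return (amp.conjugate() * amp).real * p

# q-language, equation (55.7): the differential operator sits between psi* and psi.
mean_p_q_language = integrate(
    lambda q: psi(q).conjugate() * p_operator_on_psi(q), -12.0, 12.0, 2000)

# p-language, equation (55.3): the quantity p itself sits between phi* and phi.
mean_p_p_language = integrate(momentum_weighted_density, K0 - 10.0, K0 + 10.0, 400)

print(mean_p_q_language.real, mean_p_p_language)
```

Both numbers should agree with each other and with `K0` to within the accuracy of the finite-difference derivative.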

Comparing (55.3) with (55.7), we now see that the two expressions
for the mean value of $p_k$,

$\bar{p}_k = \int \dots \int \phi^*\,p_k\,\phi\,dp_1 \dots dp_f = \int \dots \int \psi^*\,\frac{h}{2\pi i}\frac{\partial}{\partial q_k}\,\psi\,dq_1 \dots dq_f$,

are actually quite similar in form in the two different languages. In
the p-language we insert the quantity $p_k$ itself between the two functions
$\phi^*(p, t)$ and $\phi(p, t)$, which describe the state of the system in that
language, and then integrate over all values of their arguments. In the
q-language we insert instead the differential operator $\frac{h}{2\pi i}\frac{\partial}{\partial q_k}$ between
the two functions $\psi^*(q, t)$ and $\psi(q, t)$, which now describe the state of the





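system, and again integrate over all values of their arguments. Moreover,
this result can be immediately generalized, since by a simple
extension of the above calculation we readily find that the mean value
of any rational product of the momenta of the form $p_1^{l} p_2^{m} p_3^{n} \dots$ can be
calculated in the q-language with the help of the corresponding operator

$\left(\frac{h}{2\pi i}\frac{\partial}{\partial q_1}\right)^{l} \left(\frac{h}{2\pi i}\frac{\partial}{\partial q_2}\right)^{m} \left(\frac{h}{2\pi i}\frac{\partial}{\partial q_3}\right)^{n} \dots$,

where the exponents indicate the number of times the differential
operator is to be applied in succession to $\psi(q, t)$. Similarly, by the
analogous calculation, we also find that the mean value of any rational
product of the coordinates of the form $q_1^{l} q_2^{m} q_3^{n} \dots$ can be calculated in
the p-language with the help of the corresponding operator

$\left(-\frac{h}{2\pi i}\frac{\partial}{\partial p_1}\right)^{l} \left(-\frac{h}{2\pi i}\frac{\partial}{\partial p_2}\right)^{m} \left(-\frac{h}{2\pi i}\frac{\partial}{\partial p_3}\right)^{n} \dots$.

Hence it now proves possible to calculate the expectation values for
any functions $F(q_1 \dots q_f)$ or $F(p_1 \dots p_f)$, which are polynomials in the
q’s and p’s alone, with the help of expressions of the form

$\bar{F}(q) = \int \dots \int \psi^*\,F(q_1 \dots q_f)\,\psi\,dq_1 \dots dq_f = \int \dots \int \phi^*\,F\!\left(-\frac{h}{2\pi i}\frac{\partial}{\partial p_1} \dots -\frac{h}{2\pi i}\frac{\partial}{\partial p_f}\right)\phi\,dp_1 \dots dp_f$

and

$\bar{F}(p) = \int \dots \int \psi^*\,F\!\left(\frac{h}{2\pi i}\frac{\partial}{\partial q_1} \dots \frac{h}{2\pi i}\frac{\partial}{\partial q_f}\right)\psi\,dq_1 \dots dq_f = \int \dots \int \phi^*\,F(p_1 \dots p_f)\,\phi\,dp_1 \dots dp_f$. (55.9)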

In accordance with the above findings, it now becomes convenient
in the quantum mechanics to regard any such observable quantity F as
correlated in each language with a corresponding operator $\mathbf{F}$, which
can then be used to calculate the expectation value $\bar{F}$ of that observable
quantity by substitution in an expression of the general
form

$\bar{F} = \int \psi^*(x)\,\mathbf{F}\,\psi(x)\,dx$, (55.10)

where the probability amplitudes $\psi^*$ and $\psi$ and the operator $\mathbf{F}$ are
all to be expressed in the same language, and the integration is to
be taken over all values of the variables x which are characteristic of
that language. As indicated above, the operator corresponding to any
observable quantity can be conveniently denoted by the same letter
printed in Clarendon or bold-face type, and there will be little danger
of confusing this with an occasional use of bold-face type to designate
vector quantities. The properties which we have already found for such
operators may be tabulated in the form


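q-language:

$\mathbf{q}_k = q_k$
$\mathbf{p}_k = \frac{h}{2\pi i}\frac{\partial}{\partial q_k}$
$\mathbf{F}(q_1 \dots q_f) = F(q_1 \dots q_f)$
$\mathbf{F}(p_1 \dots p_f) = F\!\left(\frac{h}{2\pi i}\frac{\partial}{\partial q_1} \dots \frac{h}{2\pi i}\frac{\partial}{\partial q_f}\right)$

p-language:

$\mathbf{p}_k = p_k$
$\mathbf{q}_k = -\frac{h}{2\pi i}\frac{\partial}{\partial p_k}$
$\mathbf{F}(p_1 \dots p_f) = F(p_1 \dots p_f)$
$\mathbf{F}(q_1 \dots q_f) = F\!\left(-\frac{h}{2\pi i}\frac{\partial}{\partial p_1} \dots -\frac{h}{2\pi i}\frac{\partial}{\partial p_f}\right)$

(55.11)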


where in both columns of the table the last form can only be applied 
in the case of quantities which can be expressed as polynomials in the 
p’s or q’s respectively. 

As was intimated earlier, it will be seen that we have been able to
obtain the foregoing special results as to the expectation values and
operators for quantities which are functions either of the q’s or p’s
alone, without additions to our previous postulatory basis. It is evident,
however, that some additions may well be necessary in the case of
quantities containing both kinds of variables, since the foregoing treatments
were all based on the possibility of starting out with expressions
which contained only a single kind of variable. We must hence turn
to a consideration of the problem of constructing operators corresponding
to quantities that depend both on the coordinates and momenta,
and shall find it desirable to preface the treatment with a somewhat
general account of the properties of the kinds of operators that will
interest us.


(b) Operator manipulation. When a function $u(x_1 \dots x_n)$ of certain
variables $x_1 \dots x_n$ is changed into another function v of the same or
other variables by the application of a definite rule denoted by $\mathbf{F}$, we
can regard the process as an operation, symbolized by the equation

$v = \mathbf{F}u$, (55.12)

where v may be called the result produced by the action of the operator
$\mathbf{F}$ on the operand u.

For our present purposes we shall be interested in operations which
leave the result v a function of the same variables $x_1 \dots x_n$ as appeared
in the operand u. This limitation arises from the fact that operators
corresponding to observable quantities, when a particular language is
being employed, will contain no variables not already appearing in the
probability amplitudes, such as $\psi$ or $\phi$, on which they operate. In
general, however, the quantum mechanics can also be interested in
operations which lead to functions of new variables, as illustrated by
equations (54.2) for transforming from one language to another. The
immediately following remarks will be valid for either kind of operation.

The sum of two operators

$\mathbf{S} = \mathbf{F} + \mathbf{G}$ (55.13)

may be defined with the help of the equation

$\mathbf{S}u = (\mathbf{F} + \mathbf{G})u = \mathbf{F}u + \mathbf{G}u$ (55.14)

as the operator which will give the sum of the results of the two
individual operators.

The product of two operators

$\mathbf{P} = \mathbf{F}\mathbf{G}$ (55.15)

may be defined with the help of the equation

$\mathbf{P}u = \mathbf{F}\mathbf{G}u = \mathbf{F}(\mathbf{G}u)$ (55.16)

as the operator which will give the result obtained by operating with
the first of the two operators on the result which is itself obtained
from the operation of the second of the two operators. Since the order
of application can make a difference in the final result, operators are
in general non-commutative, i.e.

$\mathbf{F}\mathbf{G} \neq \mathbf{G}\mathbf{F}$. (55.17)

However, many pairs of operators are, of course, commutative.

The successive application of the same operator can be indicated
with the help of exponents obeying the usual law of indices

$\mathbf{F}^{n+m} = \mathbf{F}^{n}\mathbf{F}^{m}$. (55.18)

If the successive application of the two different operators $\mathbf{F}$ and
$\mathbf{F}^{-1}$ produces no change in the operand, they are called the reciprocals
of each other. The situation may be symbolized by the equation

$\mathbf{F}\mathbf{F}^{-1} = \mathbf{F}^{-1}\mathbf{F} = \mathbf{I}$, (55.19)

where $\mathbf{I}$ may be called the identity operator.

As will be seen from the above, relations between operators can often
be conveniently symbolized by equations from which the operands have
been omitted. It will be realized, however, that a particular class of
operands may have to be kept in mind when operator equations are
stated without the insertion of a specific operand.
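The definitions (55.14) and (55.16) can be made concrete with a small sketch in which the operands are polynomials in one variable x, stored as coefficient lists. The two operators chosen here, multiplication by x and differentiation with respect to x, and all function names are illustrative assumptions; they serve only to show a sum, a product, and the dependence of a product on the order of its factors.

```python
def times_x(u):
    """Operator 'multiply by x' on a polynomial given as coefficients [a0, a1, ...]."""
    return [0] + list(u)

def d_dx(u):
    """Operator 'differentiate with respect to x'."""
    return [k * u[k] for k in range(1, len(u))] or [0]

def op_sum(f, g):
    """Sum of two operators: (F + G)u = Fu + Gu, as in (55.14)."""
    def s(u):
        a, b = f(u), g(u)
        n = max(len(a), len(b))
        a = a + [0] * (n - len(a))
        b = b + [0] * (n - len(b))
        return [x + y for x, y in zip(a, b)]
    return s

def op_product(f, g):
    """Product of two operators: (FG)u = F(Gu), as in (55.16)."""
    return lambda u: f(g(u))

def trim(u):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    u = list(u)
    while len(u) > 1 and u[-1] == 0:
        u.pop()
    return u

u = [1, 2, 3]                       # u(x) = 1 + 2x + 3x^2
xd = op_product(times_x, d_dx)      # differentiate first, then multiply by x
dx = op_product(d_dx, times_x)      # multiply by x first, then differentiate
print(trim(xd(u)))                  # [0, 2, 6]
print(trim(dx(u)))                  # [1, 4, 9]
print(op_sum(times_x, d_dx)(u))     # [2, 7, 2, 3]
```

The two products give different results on the same operand, so this pair is an instance of (55.17); their difference returns the operand u itself.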





(c) Linear operators. For the purposes of the quantum mechanics
we shall be specially interested in so-called linear or distributive operators.
These have the property, when applied to the sum of more than
one operand, of giving the sum of the results obtained by application
to each operand separately, in accordance with the equation

$\mathbf{F}(u_1 + u_2) = \mathbf{F}u_1 + \mathbf{F}u_2$. (55.20)

As an additional property of linear operators we shall also require them
to satisfy the relation

$\mathbf{F}(cu) = c\mathbf{F}u$, (55.21)

where c is any constant. For the case when c is an integer this is a simple
consequence of (55.20), but for linear operators it is a required condition
when c is any constant, real or complex.

If $\mathbf{F}$ and $\mathbf{G}$ are linear operators it is readily seen that sums and
products of the form

$\mathbf{S} = c_1\mathbf{F} + c_2\mathbf{G}$, $\mathbf{P} = c_3\mathbf{F}\mathbf{G}$ (55.22)

are also linear, where $c_1$, $c_2$, and $c_3$ are arbitrary constants.

A special significance of linear operators for the quantum mechanics
arises, as will be seen in §59, from an important additive property
possessed by the solutions of equations involving such operators. Consider
the equation

$\mathbf{F}u(x_1 \dots x_n) = 0$, (55.23)

where $\mathbf{F}$ is a linear operator involving the same variables $x_1 \dots x_n$ as
the operand u. And let

$u_1 = u_1(x_1 \dots x_n)$,

$u_2 = u_2(x_1 \dots x_n)$

be any two solutions for the above equation, giving the explicit dependence
of $u_1$ and $u_2$ on the variables $x_1 \dots x_n$. Since these are both
solutions, we shall have

$\mathbf{F}u_1 = 0$ and $\mathbf{F}u_2 = 0$,

and, in accordance with (55.20) and (55.21), we shall then also have

$\mathbf{F}(c_1 u_1 + c_2 u_2) = 0$. (55.24)

Hence, in the case of linear operators, any linear combination of individual
solutions for an equation such as (55.23) will also be a solution
thereof.

The quantum mechanical operators corresponding to observable 
quantities will all be linear. 
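As a brief illustration of the additive property (55.24), one may take a hypothetical example that does not come from the text: an operator acting on functions of a single variable x, with k an assumed real constant.

```latex
\begin{aligned}
\mathbf{F} &= \frac{d^2}{dx^2} + k^2, \qquad u_1 = \sin kx, \qquad u_2 = \cos kx, \\
\mathbf{F}u_1 &= -k^2 \sin kx + k^2 \sin kx = 0, \qquad
\mathbf{F}u_2 = -k^2 \cos kx + k^2 \cos kx = 0, \\
\mathbf{F}(c_1 u_1 + c_2 u_2) &= c_1 \mathbf{F}u_1 + c_2 \mathbf{F}u_2 = 0 .
\end{aligned}
```

The last line uses only (55.20) and (55.21). For an operator that is not linear, for instance one that squares its operand, the same step would fail, since the square of a sum is not in general the sum of the squares.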

(d) Hermitian operators. In the quantum mechanics we shall be
specially interested in operators which are not only linear but also
Hermitian in character as well. This further property of operators must
be defined with reference to the class of functions, $u(x_1 \dots x_n)$, on which
they are to operate, and the range of values for the arguments, $x_1 \dots x_n$,
of these functions which is to be considered. If $u_1(x_1 \dots x_n)$ and
$u_2(x_1 \dots x_n)$ are any two of the class of operands to be considered, a
linear Hermitian operator $\mathbf{H}$ is one satisfying the relation

$\int \dots \int u_1^*(x_1 \dots x_n)\,\mathbf{H}u_2(x_1 \dots x_n)\,dx_1 \dots dx_n$
$= \int \dots \int u_2(x_1 \dots x_n)\,[\mathbf{H}u_1(x_1 \dots x_n)]^*\,dx_1 \dots dx_n$,

where the formation of a complex conjugate is indicated by an asterisk,
and the integrations are to be taken over the total range of the variables
$x_1 \dots x_n$ to which significance is ascribed. If, for example, the functions
u were probability amplitudes depending on the Cartesian coordinates
or momenta of one or more particles, the range of integrations would
be taken from minus to plus infinity, and the class of functions considered
would be those which are free from non-integrable singularities
and which vanish sufficiently rapidly as the limits of integration are
approached. Introducing some obvious abbreviations, the condition
for linear Hermitian operators given above can be written in the
simplified form

$\int u_1^*\,\mathbf{H}u_2\,dx = \int u_2\,(\mathbf{H}u_1)^*\,dx$. (55.25)


We must now investigate the possibility of combining linear Hermitian
operators to give further operators which are themselves also linear
and Hermitian. It is evident that the combination of linear Hermitian 
operators by addition will lead to operators which retain the desired 
character. For the case of multiplication, however, a special investiga- 
tion is necessary. 

Let $\mathbf{F}$ and $\mathbf{G}$ be two linear Hermitian operators suitable for use with
a class of operands $u(x_1 \dots x_n)$, and let their action on any member of
the class lead (as is the customary case of interest) to results which
are themselves members of the class. In accordance with (55.25) we
can then write

$\int (\mathbf{F}u_1)^*\,\mathbf{G}u_2\,dx = \int u_2\,(\mathbf{G}\mathbf{F}u_1)^*\,dx$,

and $\int (\mathbf{G}u_2)^*\,\mathbf{F}u_1\,dx = \int u_1\,(\mathbf{F}\mathbf{G}u_2)^*\,dx$.

It will be noticed, however, that the left-hand sides of these two equations
have been chosen so as to be the complex conjugates of each
other. We may then equate the right-hand side of one of these equations
to the complex conjugate of the right-hand side of the other,
which will give us the general relation

$\int u_1^*\,\mathbf{F}\mathbf{G}u_2\,dx = \int u_2\,(\mathbf{G}\mathbf{F}u_1)^*\,dx$. (55.26)

As another example of this general relation we can also write

$\int u_1^*\,\mathbf{G}\mathbf{F}u_2\,dx = \int u_2\,(\mathbf{F}\mathbf{G}u_1)^*\,dx$. (55.27)

We see at once from the above that the product of two Hermitian
operators is not itself in general Hermitian unless the two operators
commute.

Adding the two equations (55.26) and (55.27), however, we obtain

$\int u_1^*\,(\mathbf{F}\mathbf{G} + \mathbf{G}\mathbf{F})u_2\,dx = \int u_2\,[(\mathbf{F}\mathbf{G} + \mathbf{G}\mathbf{F})u_1]^*\,dx$. (55.28)

Hence, comparing with equation (55.25), we now see that in general
the symmetrized product of two linear Hermitian operators,

$\mathbf{H} = \mathbf{F}\mathbf{G} + \mathbf{G}\mathbf{F}$, (55.29)

will itself be linear and Hermitian.

Furthermore, subtracting equation (55.27) from (55.26), we obtain

$\int u_1^*\,(\mathbf{F}\mathbf{G} - \mathbf{G}\mathbf{F})u_2\,dx = \int u_2\,[(\mathbf{G}\mathbf{F} - \mathbf{F}\mathbf{G})u_1]^*\,dx$,

and, multiplying by i, this can evidently also be written in the form

$\int u_1^*\,i(\mathbf{F}\mathbf{G} - \mathbf{G}\mathbf{F})u_2\,dx = \int u_2\,[i(\mathbf{F}\mathbf{G} - \mathbf{G}\mathbf{F})u_1]^*\,dx$, (55.30)

so that we can also combine Hermitian operators to form another
operator which is Hermitian by the rule

$\mathbf{H} = i(\mathbf{F}\mathbf{G} - \mathbf{G}\mathbf{F})$. (55.31)

In case the two operators $\mathbf{F}$ and $\mathbf{G}$ commute with each other, we see
from (55.29) that their simple product $\mathbf{F}\mathbf{G}$ would be Hermitian without
the necessity for symmetrization, and that the Hermitian operator
given by (55.31) would reduce to zero. Since any operator commutes
with itself, any polynomial constructed from an Hermitian operator
will be Hermitian.
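A finite-dimensional analogue may help to fix these results. In the sketch below, which is an illustration under stated assumptions and not the function-space setting of the text, the operators are 2 by 2 matrices, the operands are column vectors, and the Hermitian condition becomes equality of a matrix with its conjugate transpose. The matrices `F` and `G` and all helper names are hypothetical choices.

```python
def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(ca, a, cb, b):
    """Linear combination ca*a + cb*b of two square matrices."""
    n = len(a)
    return [[ca * a[i][j] + cb * b[i][j] for j in range(n)] for i in range(n)]

def is_hermitian(a, tol=1e-12):
    """True when a[i][j] equals the complex conjugate of a[j][i] for all i, j."""
    n = len(a)
    return all(abs(a[i][j] - complex(a[j][i]).conjugate()) <= tol
               for i in range(n) for j in range(n))

# Two Hermitian matrices that do not commute (arbitrary illustrative values).
F = [[1, 2], [2, 0]]
G = [[0, -1j], [1j, 3]]

FG = matmul(F, G)
GF = matmul(G, F)
anticommutator = combine(1, FG, 1, GF)     # FG + GF, compare (55.29)
i_commutator = combine(1j, FG, -1j, GF)    # i(FG - GF), compare (55.31)

print(is_hermitian(F), is_hermitian(G))                         # True True
print(is_hermitian(FG))                                         # False
print(is_hermitian(anticommutator), is_hermitian(i_commutator)) # True True
```

The simple product fails the test because the two matrices do not commute, while the symmetrized product and i times the commutator both pass.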

The expression $(\mathbf{F}\mathbf{G} - \mathbf{G}\mathbf{F})$ occurring in (55.31) is often called the
commutator for the two operators $\mathbf{F}$ and $\mathbf{G}$. Similarly, the expression
$(\mathbf{F}\mathbf{G} + \mathbf{G}\mathbf{F})$ is sometimes called the anticommutator for the two operators.

The commutator for two operators is so important that it will be
convenient to introduce the abbreviation

$[\mathbf{F}, \mathbf{G}] = \mathbf{F}\mathbf{G} - \mathbf{G}\mathbf{F}$. (55.32)




The following properties of the commutator of two operators can be 
readily verified. They apply to linear operators in general: 

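$[\mathbf{F}, \mathbf{G}] = -[\mathbf{G}, \mathbf{F}]$,

$[\mathbf{F}, (\mathbf{G} + \mathbf{H})] = [\mathbf{F}, \mathbf{G}] + [\mathbf{F}, \mathbf{H}]$,

$[\mathbf{F}, \mathbf{G}\mathbf{H}] = [\mathbf{F}, \mathbf{G}]\mathbf{H} + \mathbf{G}[\mathbf{F}, \mathbf{H}]$,

$[\mathbf{G}\mathbf{H}, \mathbf{F}] = [\mathbf{G}, \mathbf{F}]\mathbf{H} + \mathbf{G}[\mathbf{H}, \mathbf{F}]$. (55.33)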
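The verification of these properties is a matter of adding and subtracting a suitable term. As an example, the third of them may be checked as follows, using only the definition (55.32) and the distributive property of linear operators:

```latex
\begin{aligned}
[\mathbf{F}, \mathbf{G}\mathbf{H}]
  &= \mathbf{F}\mathbf{G}\mathbf{H} - \mathbf{G}\mathbf{H}\mathbf{F} \\
  &= (\mathbf{F}\mathbf{G}\mathbf{H} - \mathbf{G}\mathbf{F}\mathbf{H})
   + (\mathbf{G}\mathbf{F}\mathbf{H} - \mathbf{G}\mathbf{H}\mathbf{F}) \\
  &= (\mathbf{F}\mathbf{G} - \mathbf{G}\mathbf{F})\mathbf{H}
   + \mathbf{G}(\mathbf{F}\mathbf{H} - \mathbf{H}\mathbf{F}) \\
  &= [\mathbf{F}, \mathbf{G}]\mathbf{H} + \mathbf{G}[\mathbf{F}, \mathbf{H}] .
\end{aligned}
```

The order of the factors is kept throughout, since none of the operators is assumed to commute with the others.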

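In the case of two Hermitian operators which do not commute, it is
sometimes desirable to re-express their product as a sum containing the
commutator and anticommutator as follows:

$\mathbf{F}\mathbf{G} = \frac{\mathbf{F}\mathbf{G} + \mathbf{G}\mathbf{F}}{2} + \frac{\mathbf{F}\mathbf{G} - \mathbf{G}\mathbf{F}}{2} = \frac{\mathbf{F}\mathbf{G} + \mathbf{G}\mathbf{F}}{2} - i\,\frac{i(\mathbf{F}\mathbf{G} - \mathbf{G}\mathbf{F})}{2}$. (55.34)

Noting (55.29) and (55.31), this re-expression may be regarded as a
decomposition into a real and imaginary Hermitian part.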

(e) The operators q and p. With the help of the foregoing we are now 
ready to discuss the nature of the operators to be associated in the 
quantum mechanics with observable quantities. 

Making use of the q-language, let us first consider the fundamental 
operators corresponding to the canonical coordinates and momenta 
$q_k$, $p_k$. These will then be given, as we have already seen, by the 
expressions 

$$\mathbf{q}_k = q_k, \qquad \mathbf{p}_k = \frac{h}{2\pi i}\frac{\partial}{\partial q_k}, \tag{55.35}$$


where the $q_k$ will have a range of significant values from $-\infty$ to $+\infty$. 
They will be used to operate on probability amplitudes $\psi(q_1 \ldots q_f, t)$ 
which are functions of the coordinates $q_1 \ldots q_f$ and also of the time t. 
Nevertheless, since we shall actually be interested in the application 
of these operators at some particular time, say $t_0$, it will be more 
convenient to write 

$$u(q_1 \ldots q_f) = \psi(q_1 \ldots q_f, t_0) \tag{55.36}$$

as a general expression for the kind of functions on which the above 
operators are to act, where $t_0$ is to be regarded as a parameter whose 
choice will determine the form of function to be considered. 

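First of all, it will be important to investigate the commutation 
properties of our fundamental operators. Making use of the explicit 
form given by (55.35) we can write 

$$(\mathbf{q}_k\mathbf{p}_k - \mathbf{p}_k\mathbf{q}_k)\,u = q_k\,\frac{h}{2\pi i}\frac{\partial u}{\partial q_k} - \frac{h}{2\pi i}\frac{\partial}{\partial q_k}(q_k u) = -\frac{h}{2\pi i}\,u. \tag{55.37}$$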


We hence see that the pair of operators corresponding to a coordinate 
and its conjugated momentum do not commute. On the other hand, 
we easily see by a similar treatment that the operators for two co- 
ordinates, or for two momenta, or for a coordinate and a momentum 
that are not conjugated do commute. Hence we may now summarize 
the commutation properties for our fundamental operators by the 
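equations 

$$\mathbf{q}_k\mathbf{q}_l - \mathbf{q}_l\mathbf{q}_k = 0, \qquad \mathbf{q}_k\mathbf{p}_l - \mathbf{p}_l\mathbf{q}_k = -\frac{h}{2\pi i}\,\delta_{kl}, \qquad \mathbf{p}_k\mathbf{p}_l - \mathbf{p}_l\mathbf{p}_k = 0, \tag{55.38}$$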

where $\delta_{kl}$ is equal to unity or zero according as k and l are the same 
or different. These commutation properties, which can be shown to be 
preserved in all forms of representation as well as in the q-language, are 
fundamental for the quantum mechanics. The lack of commutability 
for the quantum mechanical operators $\mathbf{q}_k$ and $\mathbf{p}_k$, compared with the 
commutability of the classical variables $q_k$ and $p_k$, can be regarded as 
an expression of one of the fundamental differences between quantum 
mechanics and classical mechanics. 
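
The non-commutation of a coordinate with its conjugate momentum can be illustrated by letting both operators act on a polynomial in q. This is a hedged sketch: the coefficient-list representation, the choice of units with h = 1, and the function names are assumptions made only for the illustration.

```python
# Sketch: check (q p - p q) u = -(h / 2 pi i) u for a polynomial u(q),
# using the q-language forms q = multiplication by q, p = (h/2 pi i) d/dq.
# A polynomial is stored as its coefficient list [a0, a1, a2, ...].
import math

H_PLANCK = 1.0                       # assumed units, h = 1
C = H_PLANCK / (2j * math.pi)        # the factor h / (2 pi i)

def q_op(u):
    """Multiply by q: every coefficient moves up one power."""
    return [0] + list(u)

def p_op(u):
    """Apply (h / 2 pi i) d/dq term by term."""
    return [C * n * a for n, a in enumerate(u)][1:] or [0]

def sub(a, b):
    n = max(len(a), len(b))
    a = list(a) + [0] * (n - len(a))
    b = list(b) + [0] * (n - len(b))
    return [x - y for x, y in zip(a, b)]

u = [1.0, -2.0, 0.5, 3.0]            # u(q) = 1 - 2q + 0.5 q^2 + 3 q^3
lhs = sub(q_op(p_op(u)), p_op(q_op(u)))
rhs = [-C * a for a in u]
commutator_ok = (len(lhs) == len(rhs)
                 and all(abs(x - y) < 1e-12 for x, y in zip(lhs, rhs)))
print(commutator_ok)
```

Since any sufficiently smooth amplitude can be approximated by polynomials over a finite range, the check gives some feeling for why the relation holds for the functions of interest.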

As a second important property of the fundamental operators for 
coordinates and momenta, it will immediately be seen from the form 
given by (55.35) that these operators are linear in character. Hence 
also, in accordance with (55.22), any polynomial constructed from these 
operators will also be linear. 

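As a third important property, it will also be seen that these operators 
are Hermitian. This is immediately evident in the q-language for the 
operator $\mathbf{q}_k$, since we shall obviously have 

$$\int\!\cdots\!\int u_1^{*}\,\mathbf{q}_k u_2 \; dq_1 \ldots dq_f = \int\!\cdots\!\int u_2\,(\mathbf{q}_k u_1)^{*} \; dq_1 \ldots dq_f \tag{55.39}$$

owing to the fact that the operator $\mathbf{q}_k$ signifies multiplication by the 
real variable $q_k$ and the order of multiplication makes no difference. 
To show the Hermitian character of $\mathbf{p}_k$ in the q-language is a little 
more complicated. We can see this, nevertheless, from the following 
equations: 

$$\begin{aligned}
\int\!\cdots\!\int u_1^{*}\,\mathbf{p}_k u_2 \; dq_1 \ldots dq_f
&= \int\!\cdots\!\int u_1^{*}\,\frac{h}{2\pi i}\frac{\partial u_2}{\partial q_k} \; dq_1 \ldots dq_f \\
&= \frac{h}{2\pi i}\int\!\cdots\!\int \Big[u_1^{*}u_2\Big]_{q_k=-\infty}^{q_k=+\infty} dq_1 \ldots dq_{k-1}\,dq_{k+1} \ldots dq_f
 - \frac{h}{2\pi i}\int\!\cdots\!\int u_2\,\frac{\partial u_1^{*}}{\partial q_k} \; dq_1 \ldots dq_f \\
&= -\frac{h}{2\pi i}\int\!\cdots\!\int u_2\,\frac{\partial u_1^{*}}{\partial q_k} \; dq_1 \ldots dq_f \\
&= \int\!\cdots\!\int u_2\,(\mathbf{p}_k u_1)^{*} \; dq_1 \ldots dq_f,
\end{aligned} \tag{55.40}$$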


where the first form of the equality is obtained by substituting the 
explicit expression for the operator in the q-language; the second form 
is obtained by partial integration with respect to $q_k$; the third form is 
obtained by dropping the integrated term, since in problems of interest 
probability amplitudes such as $u_1$ and $u_2$ are taken as going to zero at 
the limits $q_k = \pm\infty$ in order that there should be no probability of 
finding infinite values of the coordinates; and the fourth form of the 
equality is obtained with the help of the explicit form for the operator 
$\mathbf{p}_k$ and the rule for finding the complex conjugate of a function. 

This investigation of the commutability, linearity, and Hermitian 
character of the fundamental operators $\mathbf{q}_k$ and $\mathbf{p}_k$ was actually carried 
out in the q-language, making use of the specific forms that these 
operators have in that particular language. It will easily be seen, 
however, from the forms which these operators assume in the p-language, 

$$\mathbf{q}_k = -\frac{h}{2\pi i}\frac{\partial}{\partial p_k}, \qquad \mathbf{p}_k = p_k, \tag{55.41}$$

that all the same properties would still be found. It may also be 
mentioned that these properties are still retained when transformations are 
made to other possible modes of representation than those provided by 
the q- and p-languages. 

(f) The operators corresponding to observable quantities in general. 
Having found the properties of the fundamental operators $\mathbf{q}_k$ and $\mathbf{p}_k$, 
we may now state the general rule that the quantum mechanical 
operator corresponding to any classical function of the coordinates and 
momenta is to be obtained therefrom by replacing the variables $q_k$ and 
$p_k$ by the operators $\mathbf{q}_k$ and $\mathbf{p}_k$, in such a way as to secure a linear 
Hermitian operator. This rule can be symbolized, for any classical 
function F(q,p) of the coordinates and momenta, by the equations 

$$\text{q-language:}\quad \mathbf{F}(\mathbf{q},\mathbf{p}) = F\!\left(q,\ \frac{h}{2\pi i}\frac{\partial}{\partial q}\right), \qquad \text{p-language:}\quad \mathbf{F}(\mathbf{q},\mathbf{p}) = F\!\left(-\frac{h}{2\pi i}\frac{\partial}{\partial p},\ p\right). \tag{55.42}$$

We thus supplement the original, evidently appropriate, method of 
obtaining the operators that correspond to functions of the q's or p's 
alone, as given by (55.11), by a natural extension, as given by (55.42), 
which permits us to obtain operators that correspond to more 
complicated functions as well. 

In case the classical expression F(q,p) is a polynomial in the q's and 
p's, so that it can be written as a symmetrized sum of terms of the form 

$$F = \sum \tfrac{1}{2}\,(q_i \cdots q_j\, p_k \cdots p_l + p_k \cdots p_l\, q_i \cdots q_j), \tag{55.43}$$

there will be no difficulty in applying the above rules, after 
symmetrization has been introduced in accordance with (55.29) to secure the 
desired Hermitian character. It will be noticed, however, that the 
method of symmetrization which we have given is not necessarily 
unique for terms containing more than one pair of coordinates and 
momenta, since $q_1 q_2 p_1 p_2 + p_1 p_2 q_1 q_2$ and $q_1 p_2 q_2 p_1 + q_2 p_1 q_1 p_2$ would 
both be Hermitian operators. Nevertheless, difficulties due to such 
sources of uncertainty have not actually proved serious in the quantum 
mechanics.† 

In case the classical expression is a polynomial in the p's but contains 
arbitrary functions Q of the coordinates, so that it can be written as 
a sum of terms of the form 

$$F = \sum \tfrac{1}{2}\,(Q\, p_i \cdots p_j + p_i \cdots p_j\, Q), \tag{55.44}$$

it will be straightforward to obtain the form of the corresponding 
operator in the q-language; its form in the p-language could then be 
very complicated. Similar remarks hold for arbitrary functions in the 
momenta alone which would allow straightforward treatment in the 
p-language. 

† The problem of a unique correlation of operators with classical functions has been 
studied by Weyl, Zeits. f. Phys. 46, 1 (1927), and McCoy, Proc. Nat. Acad. 18, 674 (1932). 

In the case of classical quantities which have an explicit dependence 
on the time t, owing to the possibility that the form of the function may 
change with time, the above rule is readily extended by using at each 
instant the form of F(q,p) which then applies. This can be symbolized 
by writing 

$$\text{q-language:}\quad \mathbf{F}(\mathbf{q},\mathbf{p},t) = F\!\left(q,\ \frac{h}{2\pi i}\frac{\partial}{\partial q},\ t\right), \qquad \text{p-language:}\quad \mathbf{F}(\mathbf{q},\mathbf{p},t) = F\!\left(-\frac{h}{2\pi i}\frac{\partial}{\partial p},\ p,\ t\right). \tag{55.45}$$


The foregoing considerations assume that the coordinates and mo- 
menta are canonical in the quantum mechanical sense, for example 
Cartesian coordinates and momenta for the particles of a system, and 
the use of other coordinates can be most safely undertaken by trans- 
forming thereto after setting the problem up in Cartesian coordinates. 
In the case of functions dependent on observables such as the spin, 
which has no classical analogue, special treatment of the quantum 
mechanical operators will be necessary. See § 76. 

(g) The calculation of expectation values in general. With the help 
of the above generalized definition for the operators that correspond 
to functions of the coordinates and momenta, we shall now postulate 
in general that the expectation value $\bar{F}(q,p)$, for any measurable function 
F(q,p) of the coordinates and momenta of a system, can be calculated 
with the help of the corresponding operator $\mathbf{F}(\mathbf{q},\mathbf{p})$ by substitution into 
equations of the general form 

$$\bar{F}(q,p) = \int \psi^{*}(x)\,\mathbf{F}(\mathbf{q},\mathbf{p})\,\psi(x)\,dx, \tag{55.46}$$

where the probability amplitudes $\psi^{*}$ and $\psi$ and the operator $\mathbf{F}$ are all 
to be expressed in the same language, and the integration is to be taken 
over all values of the variables x which are characteristic of that 
language. This makes a definite addition to our postulatory basis, since 
we have previously only been able to show that such a procedure would 
necessarily lead to correct expectation values in the case of functions 
of the coordinates or momenta alone. 

By the expectation value for such a quantity F{q,p) we are to under- 
stand the mean result, which would be obtained from a series of 
measurements made when the system is known to be in the specified 
state of interest. Hence we shall only maintain that operators and 
expectation values have a necessary significance in cases where some 
conceivable experimental method for measuring the quantity in ques- 
tion can be devised. The possibility of calculating expectation values 
is very important for the quantum mechanics, since we can now make 
assertions as to what values we can expect to find on the average for 
a system in a given state, even though we can no longer make all the 
assertions as to exact values which were thought possible in the classical 
mechanics. 

It is to be specially emphasized that our formalism has been correctly 
devised so that the calculation of expectation values will lead to the 
same result, without dependence on our choice of the q- or p-language 
in which to make the computation. This can be seen in detail with the 
help of transformations similar to those by which we passed from 
the expression (55.3) for in momentum language to its expression 
(55.7) in coordinate language. It is also to be emphasized that the 
formalism has been devised, so that expectation values will be real 
numbers, in agreement with their significance as a mean result of the 
measurement of real numbers. To see this we note, in accordance with 
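(55.46), that we could write for any such expectation value $\bar{F}$, for 
example in the q-language, 

$$\bar{F} = \int \psi^{*}(q)\,\mathbf{F}(\mathbf{q},\mathbf{p})\,\psi(q)\,dq,$$

and hence also for its complex conjugate 

$$\bar{F}^{*} = \int \psi(q)\,[\mathbf{F}(\mathbf{q},\mathbf{p})\,\psi(q)]^{*}\,dq.$$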


In accordance with the Hermitian character of the operator $\mathbf{F}$, 
however, the right-hand sides of these two equations are equal, so that we 
obtain the result 

$$\bar{F}^{*} = \bar{F}, \tag{55.47}$$

showing that $\bar{F}$ actually is a real number. This is, of course, one of 
the reasons for demanding that the operators $\mathbf{F}$ should have Hermitian 
character as was done above. 
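
The role of the Hermitian requirement can be seen in a finite-dimensional analogue, where the integral in (55.46) becomes a double sum. The sketch below is illustrative only: the matrices F and G, the state vector psi, and the function names are assumptions, not taken from the text. It compares a Hermitian operator, the unsymmetrized product FG, and the symmetrized product (FG+GF)/2.

```python
# Sketch: expectation values are real for Hermitian operators, but need not
# be real for an unsymmetrized product of non-commuting Hermitian operators.

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expectation(op, psi):
    """Sum over i, j of conj(psi_i) * op_ij * psi_j."""
    n = len(psi)
    return sum(complex(psi[i]).conjugate() * op[i][j] * psi[j]
               for i in range(n) for j in range(n))

F = [[1, 1j], [-1j, 0]]      # Hermitian (assumed example)
G = [[0, 1], [1, 2]]         # Hermitian, does not commute with F
psi = [0.6, 0.8j]            # normalized: 0.36 + 0.64 = 1

FG = matmul(F, G)
GF = matmul(G, F)
sym = [[(FG[i][j] + GF[i][j]) / 2 for j in range(2)] for i in range(2)]

mean_F = expectation(F, psi)       # real
mean_FG = expectation(FG, psi)     # has an imaginary part
mean_sym = expectation(sym, psi)   # real again after symmetrization
print(abs(mean_F.imag) < 1e-12, abs(mean_FG.imag) > 1e-3,
      abs(mean_sym.imag) < 1e-12)
```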

Finally, it may be noted, since we can compute not only the mean 
value of any measurable quantity F but also the mean value for any 
power thereof, that we can obtain a complete prognosis as to the 
probabilities of finding different values of F when a series of measure- 
ments is made on a system in a given state. 


56. The Schroedinger equation for change in state with time 
In accordance with preceding sections, the instantaneous state of 
a system can be described in the quantum mechanics by specifying the 
dependence of a suitable probability amplitude on the variables of 
which it is a function at any time of interest, and from such a specifica- 
tion it is possible to evaluate the probabilities at that time for finding 
different values of the coordinates, momenta, and other measurable 
quantities belonging to the system. This possibility of evaluation, 
however, applies in the first instance only at the particular time for 
which the state of the system has been specified, and we must now 
consider the fundamental question of the change in the state of a 
system with time. 

We shall find just as in the classical mechanics that the state of an 
undisturbed system can be regarded as a definite function of the time, 
so that the state at any later time can be calculated from a knowledge 
of its state at a selected initial time. In this connexion, however, it is 
specially necessary to keep clearly in mind two important differences 
between classical and quantum mechanical states. In the first place 
we must always remain cognizant of the fact that a knowledge of the 
quantum mechanical state of a system allows us in general to make 
statements only as to the probabilities of finding given values for its 
various dynamical variables. In the second place we must not neglect 
the fact that the very process of observing a system will itself introduce 
disturbances, so that a new calculation of the undisturbed behaviour 
will, in general, have to be initiated after each new observation. Indeed 
it is characteristic of the quantum mechanics that we use the informa- 
tion obtained by observation at some initial time to set up an appro- 
priate amplitude function to represent the system, with the help of 
which we can then make predictions as to the probabilities of obtaining 
various results at the later time of a second observation. The calculus 
of the quantum mechanics is thus suitable for connecting the behaviour 
of a system at a single earlier observation with that at a single later 
observation, rather than useful for a continuous description of the 
behaviour of the system as is essayed in the classical mechanics. 

(a) Postulated form of Schroedinger equation. We are now ready to 
introduce the postulate by which changes in the state of an isolated 
undisturbed system can be calculated. This we do with the help of a 
simple partial differential equation with respect to the time t, which 
can be written in the form 

$$\mathbf{H}\psi + \frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = 0, \tag{56.1}$$

where $\psi$ is the probability amplitude which specifies the state of the 
system, and $\mathbf{H}$ is the so-called Hamiltonian operator for the system, 
which corresponds to the classical expression for its energy, as a 
function of coordinates and momenta, in the manner described in the 
preceding section. In accordance with its discovery by Schroedinger,† and 
in order to distinguish it from a related equation which applies to 
steady states and does not contain the time, the above expression may 
be spoken of as the Schroedinger equation including the time. 

† See Schroedinger, Wave Mechanics, translated from second German edition, 
London, 1928. 

The above form of expression (56.1) for the Schroedinger equation 
may be regarded as a general one, applying in any language in which 
the amplitude $\psi$ and the operator $\mathbf{H}$ are expressed. Nevertheless, since 
the classical Hamiltonian expression for the energy of familiar systems 
will usually be a simple polynomial in the momenta, even though 
depending in a complicated manner on the coordinates, it is often most 
convenient to regard the equation as expressed in coordinate language 
in the form 

$$\mathbf{H}\!\left(q,\ \frac{h}{2\pi i}\frac{\partial}{\partial q}\right)\psi(q,t) + \frac{h}{2\pi i}\frac{\partial\psi(q,t)}{\partial t} = 0, \tag{56.2}$$

since the Hamiltonian operator $\mathbf{H}\!\left(q, \frac{h}{2\pi i}\frac{\partial}{\partial q}\right)$ can then be readily 
obtained from the classical Hamiltonian function H(q,p) for the system 
by substituting a differential operator of the form $\frac{h}{2\pi i}\frac{\partial}{\partial q_k}$ in place of 
each of the momenta $p_k$ for the system. 

This construction of the appropriate Hamiltonian operator for a 
system is, of course, to be carried out in accordance with the general 
methods described in the preceding section for obtaining the linear and 
Hermitian† operators that correspond to observable quantities. As 
already made evident in that section, some difficulties and ambiguities‡ 
may be encountered in applying the methods, and extensions of method 
may be made necessary by the existence of variables, e.g. spin, which 
have no classical analogue. Nevertheless, for the most part the 
construction of the appropriate Hamiltonian operators for simple systems 
has turned out to be quite straightforward. Furthermore, in the case 
of difficulties or needs for extension of method the problem may be 
looked at from the broader point of view of discovering that 
Hamiltonian operator for the system which will give results that agree with 
experiment. This is, then, similar to the classical problem of picking 
out a Lagrangian function, which will apply when the simple rule 
of setting that function equal to the difference between kinetic and 
potential energy no longer holds as in the usual systems of Newtonian 
mechanics. 

† For discussion of a further necessary character of the Hamiltonian operator, see 
Pauli, Handbuch der Physik, xxiv/1, second edition, Berlin, 1933, p. 142. 

‡ In case an expression for the Hamiltonian operator is desired in polar or other 
coordinates which are not canonical in the quantum mechanical sense, it is best to 
begin by setting it up in canonical coordinates, e.g. Cartesian coordinates for the 
particles, and then transform to the desired coordinates. See Podolsky, Phys. Rev. 32, 
812 (1928). 

Having obtained an appropriate expression for the Hamiltonian 
operator, in coordinate language, for any system of interest, we can 
then make use of the Schroedinger equation in the form (56.2) to treat 
the problem of predicting the state of the system in its dependence on 
time. This we do by solving that equation for the state $\psi(q,t)$ as a 
function of t, with boundary conditions so chosen as to agree with our 
knowledge of the state at the time of initial observation $t_0$. This 
solution can then be used to make predictions as to the expectation 
values for different quantities at any later time t when a second 
observation is undertaken. And, with a new choice of boundary conditions to 
agree with the results actually obtained in any such second observation, 
we can then make further predictions for yet later times. 

The foregoing formalism applies in the treatment of isolated systems, 
which carry out their behaviour without any action from the outside, 
except for the disturbances introduced at the times when a new 
observation and a new selection of boundary conditions is made. In the case 
of systems which are acted on from the outside in a definite manner, 
so that the form of the Hamiltonian operator can be regarded as a 
specified function of the time, the formalism can be extended by taking 

$$\mathbf{H}\!\left(q,\ \frac{h}{2\pi i}\frac{\partial}{\partial q},\ t\right)\psi(q,t) + \frac{h}{2\pi i}\frac{\partial\psi(q,t)}{\partial t} = 0, \tag{56.3}$$

where the explicit dependence of the operator $\mathbf{H}$ on the time t has been 
indicated. 

(b) Some specific examples of the Schroedinger equation. The above 
remarks as to the general form of Schroedinger's equation may now 
be illustrated by some actual examples of the form of the equation 
expressed in coordinate language. 

As our first example let us consider a single particle of mass m 
moving in a field of force corresponding to a potential V which is a 
definite function of position. In the classical mechanics we should write 

$$H = \frac{1}{2m}\,(p_x^2 + p_y^2 + p_z^2) + V(x, y, z) \tag{56.4}$$


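as the Hamiltonian expression for the energy of the particle, in 
terms of its Cartesian coordinates x, y, z and its corresponding 
components of momenta $p_x$, $p_y$, and $p_z$. In accordance with the rule 
given by (55.42), we then obtain as the Hamiltonian operator for the 
system 

$$\begin{aligned}
\mathbf{H} &= \frac{1}{2m}\left[\left(\frac{h}{2\pi i}\frac{\partial}{\partial x}\right)^{2} + \left(\frac{h}{2\pi i}\frac{\partial}{\partial y}\right)^{2} + \left(\frac{h}{2\pi i}\frac{\partial}{\partial z}\right)^{2}\right] + V(x, y, z) \\
&= -\frac{h^{2}}{8\pi^{2}m}\left(\frac{\partial^{2}}{\partial x^{2}} + \frac{\partial^{2}}{\partial y^{2}} + \frac{\partial^{2}}{\partial z^{2}}\right) + V(x, y, z) \\
&= -\frac{h^{2}}{8\pi^{2}m}\,\nabla^{2} + V(x, y, z),
\end{aligned} \tag{56.5}$$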


where the second form of expression is obtained in accordance with the 
convention that an operator is to be successively applied the number 
of times indicated by the power to which it is raised, and the third 
form of expression is obtained by introducing the familiar abbreviation 
for the Laplacian operator. Substituting this expression for the 
Hamiltonian operator into (56.2) and changing signs, we can now write, 
as the Schroedinger equation for a single particle in a potential field, 

$$\frac{h^{2}}{8\pi^{2}m}\left(\frac{\partial^{2}\psi}{\partial x^{2}} + \frac{\partial^{2}\psi}{\partial y^{2}} + \frac{\partial^{2}\psi}{\partial z^{2}}\right) - V\psi - \frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = 0$$

or 

$$\frac{h^{2}}{8\pi^{2}m}\,\nabla^{2}\psi - V\psi - \frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = 0, \tag{56.6}$$

where V is to be treated as a function of the Cartesian coordinates 
x, y, and z. 
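
As a simple check on the signs in the equation for a single particle, one may try a plane wave with V = 0 in one dimension. The sketch below is an illustration under stated assumptions: units with h = 1, an arbitrary mass and momentum, and finite differences in place of exact derivatives. The trial function exp[(2 pi i/h)(px - Et)] with E = p^2/2m is an assumed example, not a solution singled out in the text.

```python
# Sketch: a plane wave with E = p^2 / 2m leaves a negligible residual in
# (h^2 / 8 pi^2 m) d2psi/dx2 - (h / 2 pi i) dpsi/dt = 0   (V = 0).
import cmath
import math

H_PLANCK = 1.0                 # assumed units
MASS = 1.0                     # assumed mass
P = 0.3                        # assumed momentum
E = P ** 2 / (2 * MASS)        # classical kinetic energy

def psi(x, t):
    return cmath.exp(2j * math.pi * (P * x - E * t) / H_PLANCK)

def residual(x, t, d=1e-3):
    """Left-hand side of the free equation, by central differences."""
    d2x = (psi(x + d, t) - 2 * psi(x, t) + psi(x - d, t)) / d ** 2
    dt = (psi(x, t + d) - psi(x, t - d)) / (2 * d)
    kinetic = H_PLANCK ** 2 / (8 * math.pi ** 2 * MASS) * d2x
    return kinetic - H_PLANCK / (2j * math.pi) * dt

max_residual = max(abs(residual(x, t))
                   for x in (-1.0, 0.0, 0.7) for t in (0.0, 0.5))
print(max_residual < 1e-6)
```

The two terms are each of magnitude E = 0.045 here, so a residual far below that level indicates that they cancel as the equation requires.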

As a second simple example we may next consider a system 
composed of several particles having the different masses $m_1$, $m_2$, $m_3$, etc., 
moving in a field of force which can again be described by a potential 
V which will now be a function of the positions of all the particles. 
Taking our coordinates for the system as a whole as being the Cartesian 
coordinates for the particles of which it is composed, and applying the 
same methods as above, we easily obtain, as the Schroedinger equation 
for a system of particles in a potential field, the result 

$$\sum_k \frac{h^{2}}{8\pi^{2}m_k}\left(\frac{\partial^{2}\psi}{\partial x_k^{2}} + \frac{\partial^{2}\psi}{\partial y_k^{2}} + \frac{\partial^{2}\psi}{\partial z_k^{2}}\right) - V\psi - \frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = 0$$

or 

$$\sum_k \frac{h^{2}}{8\pi^{2}m_k}\,\nabla_k^{2}\psi - V\psi - \frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = 0, \tag{56.7}$$

where $m_k$, $x_k$, $y_k$, and $z_k$ are the mass and Cartesian coordinates 
corresponding to the kth particle and the indicated summations are to be 
taken over all the particles composing the system. 

As a third example of the explicit form of Schroedinger's equation 
let us now consider a particle of mass m which carries an electric charge 
e, and moves in a combined electric and magnetic field. On account of 
the presence of magnetic forces, the action of such a field on a moving 
charge can no longer be described by a single potential function V, but 
can be treated with the help of the scalar potential $\Phi(x, y, z, t)$ together 
with the components $A_x$, $A_y$, and $A_z$ of the vector potential $\mathbf{A}(x, y, z, t)$ 
used in electromagnetic theory. The intensities $\mathbf{E}$ and $\mathbf{H}$ of the electric 
and magnetic fields are then given by 

$$\mathbf{E} = -\operatorname{grad}\Phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} \tag{56.8}$$

and 

$$\mathbf{H} = \operatorname{curl}\mathbf{A}, \tag{56.9}$$

and the force acting on the particle by the Lorentz expression 

$$\mathbf{F} = e\left(\mathbf{E} + \frac{1}{c}\,[\mathbf{u}\times\mathbf{H}]\right), \tag{56.10}$$

where $\mathbf{u}$ is the velocity of the particle. 

As first shown by Larmor,† the classical behaviour of such a particle 
under the action of this force can be described by the usual methods 
of mechanics with the help of the Lagrangian function 

$$L = \tfrac{1}{2}m(\dot{x}^{2} + \dot{y}^{2} + \dot{z}^{2}) - e\Phi + \frac{e}{c}\,(A_x\dot{x} + A_y\dot{y} + A_z\dot{z}). \tag{56.11}$$

In accordance with (10.2) this leads to the components of momenta 

$$p_x = m\dot{x} + \frac{e}{c}A_x, \qquad p_y = m\dot{y} + \frac{e}{c}A_y, \qquad p_z = m\dot{z} + \frac{e}{c}A_z, \tag{56.12}$$

and in accordance with (10.3) to the classical Hamiltonian 

$$\begin{aligned}
H &= p_x\dot{x} + p_y\dot{y} + p_z\dot{z} - L \\
&= \frac{1}{2m}\sum_k\left(p_k - \frac{e}{c}A_k\right)^{2} + e\Phi \\
&= \frac{1}{2m}\sum_k\left[p_k^{2} - \frac{e}{c}\,(p_kA_k + A_kp_k) + \frac{e^{2}}{c^{2}}A_k^{2}\right] + e\Phi,
\end{aligned} \tag{56.13}$$

where the second form is obtained from the first by substituting (56.11) 
and (56.12), and the third form has been symmetrized in accordance 
with (55.29) in order to secure an operator having Hermitian character. 

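† Larmor, Aether and Matter, Cambridge, 1900. 

Applying our rule (55.42) for changing this classical Hamiltonian into 
the corresponding quantum mechanical operator and substituting into 
(56.2), we can now write the Schroedinger equation for a charged 
particle in an electromagnetic field in the form 

$$\frac{1}{2m}\sum_k\left[\left(\frac{h}{2\pi i}\frac{\partial}{\partial x_k}\right)^{2} - \frac{e}{c}\left(\frac{h}{2\pi i}\frac{\partial}{\partial x_k}A_k + A_k\frac{h}{2\pi i}\frac{\partial}{\partial x_k}\right) + \frac{e^{2}}{c^{2}}A_k^{2}\right]\psi + e\Phi\psi + \frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = 0,$$

where the subscripts k = 1, 2, 3 refer to the Cartesian coordinates 
x, y, z. Carrying out the application of the operators as indicated in 
the first term and changing signs, this can be re-expressed in the more 
convenient form 

$$\frac{h^{2}}{8\pi^{2}m}\,\nabla^{2}\psi + \frac{eh}{2\pi i mc}\,(\mathbf{A}\cdot\operatorname{grad}\psi) + \frac{eh}{4\pi i mc}\,\psi\operatorname{div}\mathbf{A} - \frac{e^{2}}{2mc^{2}}A^{2}\psi - e\Phi\psi - \frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = 0. \tag{56.14}$$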


It will be noticed that each of these three examples of the explicit 
form of the Schroedinger equation has provided us with a differential 
equation for the probability amplitude $\psi(q_1 \ldots q_f, t)$ of the system as 
a function of the coordinates and time. Since the quantum mechanical 
state of a system is determined at any time by the form of $\psi$, solutions 
of these equations, with boundary conditions chosen to correspond to 
any desired initial state, will then provide explicit solutions of the 
problem proposed at the beginning of the present section, of describing 
how the quantum mechanical state of a system depends on the time. 

(c) Transformation of Schroedinger equation from coordinate to 
momentum language. As already remarked above, it is often most 
convenient to express the Schroedinger equation in coordinate language 
since the dependence on momenta will frequently be such as to lead 
at once to a simple expression for the Hamiltonian operator in that 
language. Nevertheless, we regard our original expression of the 
Schroedinger equation (56.1) as valid in any language provided we use 
the appropriate expression for the operator $\mathbf{H}$ in that language. In the 
case of a classical Hamiltonian which can be expressed as a polynomial 
in the coordinates as well as in the momenta, the transformation of 
the Schroedinger equation from coordinate to momentum language 
proves simple. 

To investigate this, let us start with the Schroedinger equation 
(56.2), expressed in coordinate language in the form 

$$\mathbf{H}\!\left(q,\ \frac{h}{2\pi i}\frac{\partial}{\partial q}\right)\psi(q,t) + \frac{h}{2\pi i}\frac{\partial\psi(q,t)}{\partial t} = 0, \tag{56.15}$$

and assuming the coordinates to be canonical in the quantum 
mechanical sense, replace $\psi(q,t)$ by $\phi(p,t)$ with the help of the transformation 
equation (54.2) 

$$\psi(q,t) = h^{-f/2}\int\!\cdots\!\int \phi(p,t)\,e^{\frac{2\pi i}{h}(p_1q_1+\cdots+p_fq_f)}\,dp_1 \ldots dp_f, \tag{56.16}$$

where the integrations are over all values of the p's from $-\infty$ to $+\infty$. 
Substituting (56.16) into (56.15), and noting that the $\phi$ does not now 
contain the q's, we can write 






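$$h^{-f/2}\int\!\cdots\!\int\left[\phi(p,t)\,\mathbf{H}\!\left(q,\ \frac{h}{2\pi i}\frac{\partial}{\partial q}\right)e^{\frac{2\pi i}{h}\sum_k p_kq_k} + \frac{h}{2\pi i}\frac{\partial\phi(p,t)}{\partial t}\,e^{\frac{2\pi i}{h}\sum_k p_kq_k}\right]dp_1 \ldots dp_f = 0.$$

It will be seen, however, in case $\mathbf{H}$ is a polynomial in the coordinates 
q and in the differential operators $\frac{h}{2\pi i}\frac{\partial}{\partial q}$ as assumed, that this can be 
rewritten in the form 

$$h^{-f/2}\int\!\cdots\!\int\left[\phi(p,t)\,\mathbf{H}\!\left(\frac{h}{2\pi i}\frac{\partial}{\partial p},\ p\right)e^{\frac{2\pi i}{h}\sum_k p_kq_k} + \frac{h}{2\pi i}\frac{\partial\phi(p,t)}{\partial t}\,e^{\frac{2\pi i}{h}\sum_k p_kq_k}\right]dp_1 \ldots dp_f = 0,$$

in which the operator still acts on the exponential factor, each q 
having been replaced by $\frac{h}{2\pi i}\frac{\partial}{\partial p}$ and each $\frac{h}{2\pi i}\frac{\partial}{\partial q}$ by p. 
By carrying out partial integrations with respect to the momenta and 
making use of the circumstance that $\phi(p,t)$ will go to zero at the limits 
of integration, this can then be rewritten in the form 

$$h^{-f/2}\int\!\cdots\!\int\left[\mathbf{H}\!\left(-\frac{h}{2\pi i}\frac{\partial}{\partial p},\ p\right)\phi(p,t) + \frac{h}{2\pi i}\frac{\partial\phi(p,t)}{\partial t}\right]e^{\frac{2\pi i}{h}\sum_k p_kq_k}\,dp_1 \ldots dp_f = 0.$$

Since this equation must be satisfied for arbitrary choices of the q's, 
this now leads to the desired result 

$$\mathbf{H}\!\left(-\frac{h}{2\pi i}\frac{\partial}{\partial p},\ p\right)\phi(p,t) + \frac{h}{2\pi i}\frac{\partial\phi(p,t)}{\partial t} = 0, \tag{56.17}$$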


giving a simple expression of the Schroedinger equation in momentum 
language, with the operator $\mathbf{H}$ of the expected form. It should be borne 
in mind, unless the dependence of H on q is simple (polynomial) as 
assumed above, that the operator $\mathbf{H}\!\left(-\frac{h}{2\pi i}\frac{\partial}{\partial p}, p\right)$ will have to be 
interpreted as an integral and not a differential operator. 

For a special case, this then illustrates the validity of the assumed 
possibility of expressing the Schroedinger equation either in the 
q-language using the amplitude $\psi(q,t)$ or in the p-language using the 
amplitude $\phi(p,t)$. We shall usually find it most convenient, however, 
to use a coordinate representation in setting up our problems. The 
possibility of using a Schroedinger equation with probability 
amplitudes corresponding to still other observables will appear later. (See 
§ 67(c).) 


57. Summary of postulatory basis 

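We have now completed our exposition of the postulatory basis for 
non-relativistic quantum mechanics. The account has been so long that 
a brief summary in the q- and p-languages, as given by the following 
equations, will not be out of place. 

For the relation between probability densities and amplitudes we have 

$$W(q,t)\,dq = \psi^{*}(q,t)\,\psi(q,t)\,dq, \qquad W(p,t)\,dp = \phi^{*}(p,t)\,\phi(p,t)\,dp. \tag{57.1}$$

For the transformation of probability amplitudes we have 

$$\psi(q,t) = h^{-f/2}\int \phi(p,t)\,e^{\frac{2\pi i}{h}\sum_k p_kq_k}\,dp, \qquad \phi(p,t) = h^{-f/2}\int \psi(q,t)\,e^{-\frac{2\pi i}{h}\sum_k p_kq_k}\,dq. \tag{57.2}$$

For the operators corresponding to observable quantities we have 

$$\mathbf{F}(\mathbf{q},\mathbf{p},t) = F\!\left(q,\ \frac{h}{2\pi i}\frac{\partial}{\partial q},\ t\right), \qquad \mathbf{F}(\mathbf{q},\mathbf{p},t) = F\!\left(-\frac{h}{2\pi i}\frac{\partial}{\partial p},\ p,\ t\right). \tag{57.3}$$

For the mean or expectation value of an observable quantity we have 

$$\bar{F} = \int \psi^{*}(q,t)\,\mathbf{F}\,\psi(q,t)\,dq = \int \phi^{*}(p,t)\,\mathbf{F}\,\phi(p,t)\,dp. \tag{57.4}$$

For the change of state with time we have the Schroedinger equation 

$$\mathbf{H}\!\left(q,\ \frac{h}{2\pi i}\frac{\partial}{\partial q},\ t\right)\psi + \frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = 0, \qquad \mathbf{H}\!\left(-\frac{h}{2\pi i}\frac{\partial}{\partial p},\ p,\ t\right)\phi + \frac{h}{2\pi i}\frac{\partial\phi}{\partial t} = 0. \tag{57.5}$$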


The equations as written contain some obvious abbreviations and 
must be used, of course, in the light of the preceding discussion. They 
are sufficient for a non-relativistic quantum mechanics except for special 
assumptions that have to be introduced in connexion with observables 
having no classical analogue that can be expressed as a function of 
coordinates and momenta. 

C. THEOREMS ILLUSTRATING THE NATURE OF QUANTUM MECHANICS 

58. Probability density and probability current 
We must now derive a number of theorems from the foregoing 
postulates which will illustrate the nature of quantum mechanics. 
These will include in §§60, 61, 62 those consequences of the new theory 
which are the analogues of the ideas as to energy levels, wave-particle 
duality, and uncertainty that have led us to the belief that the classical 
mechanics is not sufficient. For the most part we shall carry out the 
derivations in the q-language. 

(a) The conservation of total probability. First of all, we must 
investigate an important question as to the consistency of the formalism 
which we have set up. In equations (57.1) we have related certain 
probability amplitudes to the actual probabilities of finding values for 
coordinates and momenta that lie in specified ranges, and in equations 
(57.5) we have then also stated how these amplitudes are to depend 
on the time. It is evident for consistency that this time dependence 
must at least be such as to preserve constant the total probability of 
finding values for our variables that lie somewhere within the ranges 
that are possible for them. This we can easily show to be the case. 

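For the rate of change with time in the total probability of finding 
values for the coordinates that lie somewhere, we can write 

$$\frac{d}{dt}\int \psi^{*}\psi\,dq = \int\left(\psi^{*}\frac{\partial\psi}{\partial t} + \psi\frac{\partial\psi^{*}}{\partial t}\right)dq,$$

where the integrations are to be taken over the total range of possible 
values for the coordinates. With the help of the Schroedinger equation 
we can then re-express this, however, in the form 

$$\frac{d}{dt}\int \psi^{*}\psi\,dq = -\frac{2\pi i}{h}\left[\int \psi^{*}\,\mathbf{H}\psi\,dq - \int \psi\,(\mathbf{H}\psi)^{*}\,dq\right] = 0, \tag{58.1}$$

where the value zero arises in accordance with (55.25) from the 
Hermitian character of the Hamiltonian operator $\mathbf{H}$. We now see why it has 
been specially important to prescribe this character for the Hamiltonian 
operator. 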

Having found that the total probability does remain constant, it will 
of course be possible to normalize our probability amplitudes in such 
a way as to make the total probability of finding the system somewhere 
in configuration space permanently equal to unity: 

$$\int \psi^{*}\psi\,dq = 1. \tag{58.2}$$

Similar treatment can be given to the total probability of finding the 
system at least somewhere in momentum space. 

(b) The concept of probability current. It is also possible to 
investigate somewhat more in detail the nature of the processes by which the 
probability density can change at a given point while its integrated 
value over the whole of configuration space remains constant. It will 
be sufficient to illustrate this for the case of a single particle of mass m 
moving in a potential field V(x,y,z). 

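In agreement with (56.6) we can then write the Schroedinger 
equation in the form 

$$\frac{\partial\psi}{\partial t} = -\frac{h}{4\pi i m}\left(\frac{\partial^{2}\psi}{\partial x^{2}} + \frac{\partial^{2}\psi}{\partial y^{2}} + \frac{\partial^{2}\psi}{\partial z^{2}}\right) - \frac{2\pi i}{h}\,V\psi,$$

and the corresponding conjugate equation in the form 

$$\frac{\partial\psi^{*}}{\partial t} = \frac{h}{4\pi i m}\left(\frac{\partial^{2}\psi^{*}}{\partial x^{2}} + \frac{\partial^{2}\psi^{*}}{\partial y^{2}} + \frac{\partial^{2}\psi^{*}}{\partial z^{2}}\right) + \frac{2\pi i}{h}\,V\psi^{*}.$$

Multiplying the first of these equations by $\psi^{*}$ and the second by $\psi$ and 
then adding, we obtain 

$$\psi^{*}\frac{\partial\psi}{\partial t} + \psi\frac{\partial\psi^{*}}{\partial t} = -\frac{h}{4\pi i m}\left[\psi^{*}\left(\frac{\partial^{2}\psi}{\partial x^{2}} + \frac{\partial^{2}\psi}{\partial y^{2}} + \frac{\partial^{2}\psi}{\partial z^{2}}\right) - \psi\left(\frac{\partial^{2}\psi^{*}}{\partial x^{2}} + \frac{\partial^{2}\psi^{*}}{\partial y^{2}} + \frac{\partial^{2}\psi^{*}}{\partial z^{2}}\right)\right],$$

and this can easily be shown equivalent to the form 

$$\frac{\partial}{\partial t}(\psi^{*}\psi) + \frac{h}{4\pi i m}\operatorname{div}(\psi^{*}\operatorname{grad}\psi - \psi\operatorname{grad}\psi^{*}) = 0. \tag{58.3}$$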


Hence, if we now define the density of probability current as the 
vector 

$$\mathbf{S} = \frac{h}{4\pi i m}\,(\psi^{*}\operatorname{grad}\psi - \psi\operatorname{grad}\psi^{*}), \tag{58.4}$$

which will be seen to be a real quantity, we can connect probability 
density 

$$W = \psi^{*}\psi \tag{58.5}$$

and probability current density $\mathbf{S}$ by the simple relation 

$$\frac{\partial W}{\partial t} + \operatorname{div}\mathbf{S} = 0, \tag{58.6}$$

which shows, in agreement with the conservation of total probability, 
that the change in probability in a given region can be regarded as due 
to flow through its boundary. 
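
The definition of the probability current can be tried out on a one-dimensional amplitude consisting of a real Gaussian envelope multiplied by a plane-wave factor. For such an amplitude the current should equal (p/m) times the probability density. The sketch is illustrative only: the amplitude, the units with h = 1, and the numerical values are assumptions, and the derivative is taken by a central difference.

```python
# Sketch: S = (h / 4 pi i m) (psi* dpsi/dx - psi dpsi*/dx) for
# psi(x) = exp(-x^2 / 2) * exp(2 pi i p x / h); expected S = (p/m) |psi|^2.
import cmath
import math

H_PLANCK = 1.0     # assumed units
MASS = 2.0         # assumed mass
P = 0.5            # assumed momentum of the plane-wave factor

def psi(x):
    return math.exp(-x * x / 2) * cmath.exp(2j * math.pi * P * x / H_PLANCK)

def current(x, d=1e-5):
    """x-component of the probability current, by a central difference."""
    dpsi = (psi(x + d) - psi(x - d)) / (2 * d)
    bracket = psi(x).conjugate() * dpsi - psi(x) * dpsi.conjugate()
    return H_PLANCK / (4j * math.pi * MASS) * bracket

x0 = 0.4
s = current(x0)
expected = (P / MASS) * abs(psi(x0)) ** 2
print(abs(s.imag) < 1e-12, abs(s.real - expected) < 1e-8)
```

The vanishing imaginary part illustrates the remark that S is a real quantity, and the sign of the real part shows flow in the direction of the momentum.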

In our present case the components of $\mathbf{S}$ have a simple pictorial 
interpretation, for example $S_x$ being the probability in unit time that 
the particle would be found to pass through unit area perpendicular to 
the x-axis, moving in the positive direction, diminished by the similar 
probability for passage in the negative direction. 

Analogous treatments of probability current can be given for more 
complicated cases. For a system of charged particles m a combiued 
dectric and magnetic field we can use a configuration space of 3n 
dimensions, where n is the number of particles, and denote the x, y, z 
coordinates for the successive particles by the general symbol y*. For 
the jfcth component of the generalized probability current we then havef 


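$$S_k = \frac{h}{4\pi i m_k}\left(\psi^*\frac{\partial\psi}{\partial q_k} - \psi\frac{\partial\psi^*}{\partial q_k}\right) - \frac{e_k}{m_k c}A_k\,\psi^*\psi, \tag{58.7}$$

where $m_k$ and $e_k$ are the mass and charge of the particle having $q_k$ as
one of its Cartesian coordinates and $A_k$ is the component of the vector
potential in the direction of the $q_k$-axis.

In analogy to (58.6) we then have as the equation of continuity

$$\frac{\partial W}{\partial t} + \sum_k \frac{\partial S_k}{\partial q_k} = 0. \tag{58.8}$$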


59. The principle of superposition 
Since the two operators (for the Hamiltonian and for differentiation
with respect to time) which occur in the Schroedinger equation

$$\mathbf{H}\psi(q,t) + \frac{h}{2\pi i}\frac{\partial\psi(q,t)}{\partial t} = 0 \tag{59.1}$$

are both linear, it is evident, in agreement with the discussion of
§ 66 (c), that any linear combination of individual solutions of this
equation will also be a solution thereof. Thus, if $\psi_1(q,t)$ and $\psi_2(q,t)$ are
solutions, the combination

$$\psi(q,t) = c_1\psi_1(q,t) + c_2\psi_2(q,t) \tag{59.2}$$

will also be a solution, where $c_1$ and $c_2$ can be any constants real or
complex. This result expresses an important character of the quantum
mechanics which may be called the principle of superposition.
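
The linearity argument can be written out in one line. The following is a sketch, assuming the Schroedinger equation in the form $\mathbf{H}\psi + (h/2\pi i)\,\partial\psi/\partial t = 0$ and taking the coefficients $c_1$, $c_2$ to be constants:

```latex
\begin{align*}
\mathbf{H}(c_1\psi_1 + c_2\psi_2)
  + \frac{h}{2\pi i}\frac{\partial}{\partial t}(c_1\psi_1 + c_2\psi_2)
 &= c_1\Bigl(\mathbf{H}\psi_1 + \frac{h}{2\pi i}\frac{\partial\psi_1}{\partial t}\Bigr)
  + c_2\Bigl(\mathbf{H}\psi_2 + \frac{h}{2\pi i}\frac{\partial\psi_2}{\partial t}\Bigr) \\
 &= c_1\cdot 0 + c_2\cdot 0 = 0 .
\end{align*}
```

The first equality uses only the linearity of $\mathbf{H}$ and of $\partial/\partial t$; the second uses the assumption that $\psi_1$ and $\psi_2$ are each solutions.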

† See Pauli, Handbuch der Physik, xxiv/1, second edition, Berlin, 1933, p. 109.




The principle can also be expressed in the somewhat more general
form

(59.3)

where, with an appropriate choice of $c_1$ and $c_2$, we can regard $\psi$ as
expressed in one quantum mechanical language and $\psi_1$ and $\psi_2$ in
another. As an important example of this possible way of super-
imposing solutions, we may take our fundamental equation (57.2)

$$\psi(q,t) = h^{-f/2}\int e^{(2\pi i/h)(p_1q_1+\cdots+p_fq_f)}\,\psi(p,t)\,dp \tag{59.4}$$

which gives the possibility of expressing a given state of a system either
directly in coordinate language or by a superposition of states in
momentum language.

The possibility of superposing individual solutions of the Schroedinger
equation is somewhat analogous to the possibility of superposing solu-
tions corresponding to different wave motions in the classical mechanics;
and the actual solutions of (59.1) often show an explicit trigonometric
or exponential dependence of $\psi$ on $q$ and $t$ which is reminiscent of
familiar wave motions. This is in agreement with the frequent use
of the term wave mechanics as an alternative for quantum mechanics,
and the practice of speaking of the Schroedinger equation as a
wave equation to be used for calculating the amplitude $\psi$ of probability
waves.

Nevertheless, the actual functions superimposed in equations such as
(59.2) have a physical significance quite different from that of functions
which can be superimposed in the older mechanics. Thus, at a given
time $\psi_1$ and $\psi_2$ represent possible quantum mechanical states of the
system and the result of their superposition $\psi$ also represents a possible
state. This will make it possible in the quantum mechanics to treat
a system in such a state $\psi$ also from the point of view of its being
partly in state $\psi_1$ and partly in state $\psi_2$, and to make predictions as to
the relative probability after a suitable experimental observation that
it will actually be found to be in one or the other of these latter states.
The quantum mechanics thus provides possibilities for the superposition
and decomposition of states (which latter can be made in a variety of
ways) for which the classical mechanics with its perfectly definite
states provides no analogy.

Characteristic for the whole structure of the wave mechanics is the
fact that the wave functions $\psi$ which obey the superposition principle
are not the physically observed things, and that the observable expecta-
tion values for dynamical variables depend bilinearly on $\psi$ and $\psi^*$.
Thus the calculation of these expectation values will exhibit inter-
ference effects, in the sense that the expectation values corresponding
to the state

$$\psi = c_1\psi_1 + c_2\psi_2$$

will not in general be the weighted average of the expectation values
for the states $\psi_1$ and $\psi_2$. For example, considering (59.4), we see in
predicting the behaviour of a system whose state in coordinate language
is expressed as a superposition of various states in momentum language,
that the interference between these states may be all-important when
this form of expression is used. It is clear that such a system cannot
be represented as a collection of systems with various but well-defined
momenta. As especially emphasized by Dirac, this may be regarded
as a fundamental expression of the complementary character of the
physical description of a given situation in terms of the concepts of
position and of momentum.
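
The interference effect in expectation values can be made concrete with a small finite-dimensional stand-in. The following is only a sketch: the two-component vectors replace wave functions, the matrix `F` is a hypothetical Hermitian observable, and the coefficients are chosen arbitrarily for illustration.

```python
def inner(a, b):
    """Inner product sum(conj(a_i) * b_i), the analogue of integrating a* b."""
    return sum(x.conjugate() * y for x, y in zip(a, b))


def apply(matrix, vec):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def expectation(matrix, vec):
    """Expectation value <vec|matrix|vec> / <vec|vec> (real for Hermitian matrix)."""
    return (inner(vec, apply(matrix, vec)) / inner(vec, vec)).real


# Hypothetical Hermitian observable and two orthonormal states.
F = [[1.0 + 0j, 0.5 - 0.5j], [0.5 + 0.5j, -1.0 + 0j]]
psi1 = [1.0 + 0j, 0j]
psi2 = [0j, 1.0 + 0j]
c1, c2 = 0.6 + 0j, 0.8j  # |c1|^2 + |c2|^2 = 1

# Superposed state psi = c1 psi1 + c2 psi2.
psi = [c1 * a + c2 * b for a, b in zip(psi1, psi2)]

weighted_average = (
    abs(c1) ** 2 * expectation(F, psi1) + abs(c2) ** 2 * expectation(F, psi2)
)
# Cross term arising because the expectation value is bilinear in psi and psi*.
interference = 2.0 * (c1.conjugate() * c2 * inner(psi1, apply(F, psi2))).real
superposed = expectation(F, psi)
```

Here `superposed` equals `weighted_average + interference`, and the interference term is not zero, so the expectation value for the superposed state is not the weighted average of the separate expectation values.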


60. Energy levels for systems in steady states. Eigenvalues and 
eigenfunctions 

We are now ready to treat the quantum mechanical explanation of 
the existence of discrete energy levels. The actual occurrence of such 
levels was the first of the considerations which we mentioned as leading 
to a necessity for modifying classical mechanics. 

We shall define a quantum mechanical system, having a Hamil-
tonian not itself dependent on the time, as being in a steady state
when both the probability density (58.5)

$$W = \psi^*(q,t)\,\psi(q,t) \tag{60.1}$$

and the probability current (58.7)

$$S_k = \frac{h}{4\pi i m_k}\left(\psi^*\frac{\partial\psi}{\partial q_k} - \psi\frac{\partial\psi^*}{\partial q_k}\right) - \frac{e_k}{m_k c}A_k\,\psi^*\psi \tag{60.2}$$

are themselves independent of the time. For such a state, both the
chance for finding a given particle at a specified place and for finding
it moving through a specified area will be independent of the time.
Hence the designation steady state is appropriate.

To satisfy the first of the above conditions we must evidently take
our amplitudes of the form

$$\psi^* = u^*(q)\,e^{if(q,t)} \quad\text{and}\quad \psi = u(q)\,e^{-if(q,t)},$$

where $u(q)$ is a function of the coordinates alone, while $f(q,t)$ may for
the present be any real function of the $q$'s and $t$.

To satisfy the second of our conditions for a steady state, since $A_k$,
being part of the Hamiltonian, is not an explicit function of $t$, we shall
then have to have

$$\psi^*\frac{\partial\psi}{\partial q_k} - \psi\frac{\partial\psi^*}{\partial q_k} = u^*\frac{\partial u}{\partial q_k} - u\frac{\partial u^*}{\partial q_k} - 2i\,u^*u\,\frac{\partial f(q,t)}{\partial q_k}$$

independent of the time. And this will only be true if $f$ is of the form

$$f(q,t) = g(q) + h(t).$$

Hence, absorbing $e^{-ig(q)}$ into $u(q)$, we can now write the probability
amplitude for a steady state in the necessary form

$$\psi(q,t) = u(q)\,e^{-ih(t)}.$$

This expression for the amplitude, however, must of course be a
solution of the Schroedinger equation (57.5), and we can obtain further
information as to the form of $h(t)$ by substituting therein. Doing so we
obtain

$$\mathbf{H}\,u(q)\,e^{-ih(t)} + \frac{h}{2\pi i}\,u(q)\,\frac{\partial}{\partial t}e^{-ih(t)} = 0,$$

where the $h$ in the factor $h/2\pi i$ is Planck's constant and not the function
$h(t)$; and evidently this can only be satisfied if $h(t)$ is a constant multiplied
by the time, $h(t) = \text{const.} \times t$.

Hence we may now write the probability amplitude for a system in
a steady state in the following general form, which will prove convenient,

$$\psi(q,t) = u(q)\,e^{-(2\pi i/h)Et}, \tag{60.3}$$

where $E$ must be a real constant, in order that $\psi^*\psi$ shall retain its
independence of $t$. Furthermore, by substituting this result back into
the general expression (57.6) for the Schroedinger equation we now
obtain, as a special case applicable to steady states, the so-called
Schroedinger wave equation without the time

$$\mathbf{H}u(q) = E\,u(q). \tag{60.4}$$
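
The substitution leading to the equation without the time can be written out explicitly. The lines below are a sketch, assuming the Schroedinger equation in the form $\mathbf{H}\psi + (h/2\pi i)\,\partial\psi/\partial t = 0$, a Hamiltonian that does not contain the time, and the steady-state amplitude $\psi = u(q)\,e^{-(2\pi i/h)Et}$:

```latex
\begin{align*}
\mathbf{H}\bigl[u(q)\,e^{-(2\pi i/h)Et}\bigr]
  + \frac{h}{2\pi i}\,\frac{\partial}{\partial t}\bigl[u(q)\,e^{-(2\pi i/h)Et}\bigr]
 &= e^{-(2\pi i/h)Et}\,\mathbf{H}u(q)
  + \frac{h}{2\pi i}\Bigl(-\frac{2\pi i}{h}E\Bigr)u(q)\,e^{-(2\pi i/h)Et} \\
 &= e^{-(2\pi i/h)Et}\,\bigl[\mathbf{H}u(q) - E\,u(q)\bigr] = 0 .
\end{align*}
```

Since the exponential factor never vanishes, the bracket must vanish, which is the equation $\mathbf{H}u(q) = E\,u(q)$.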

With the help of these equations we can now derive some important
general results for systems in steady states.

In the first place, substituting (60.3) into our general expression (57.4)
for the mean value of an observable quantity, we can evidently write

$$\overline{E^n} = \int u^*(q)\,\mathbf{H}^n u(q)\,dq$$

as an expression for the mean value of the $n$th power of the energy of
a system in a stationary state. In accordance with (60.4) this will
give

$$\overline{E^n} = E^n \tag{60.5}$$

valid for any integral power $n$; and this can only be satisfied if the
energy of the system has a value exactly equal to our constant $E$.
Hence a system in a given steady state is characterized by a precise
value $E$ which will be found for its energy.

In the second place it is to be specially noted that not all values of
the energy may be physically possible, since we can obviously only
permit such values for the constant $E$ in equation (60.4) as will lead
to solutions for $u(q)$ which allow a sensible interpretation of $u^*u$ as
a probability density. Values of $E$ which make $u(q)$ single-valued, con-
tinuous, and finite throughout the range of the coordinates $q$ will
evidently be satisfactory.† These allowable values of $E$ may be called
eigenvalues for the energy of the system, and the associated solutions
for $u(q)$ eigenfunctions.‡ Corresponding eigenvalues and eigenfunctions
are conveniently designated by attaching the same suffix, thus $E_n$ and $u_n$.

Applying the indicated methods of treatment to the actual problems
of atomic mechanics, cases both of discrete and of continuous spectra
of energy eigenvalues are encountered. These make it possible for the
quantum mechanics to explain both the discrete energy levels exhibited
by atoms with energies less than their ionization limit and the con-
tinuous changes in energy exhibited above that limit. Furthermore, as
first shown for the case of the hydrogen atom by Schroedinger,‖ the
quantum mechanics gives a correct prediction of the energy levels
actually found.

Returning to a consideration of the general results which can be
obtained for systems in steady states, let us take $E_n$ and $E_m$ as two
possible eigenvalues of the energy, and $u_n$ and $u_m$ as the corresponding
eigenfunctions. In accordance with (60.4) and the corresponding com-
plex conjugate equation, we can then write

$$\mathbf{H}u_n = E_n u_n \quad\text{and}\quad (\mathbf{H}u_m)^* = E_m^* u_m^*.$$

Multiplying the first of these equations by $u_m^*$ and the second by $u_n$,
and integrating over the total range of coordinates $q$, we obtain

$$\int u_m^*\,\mathbf{H}u_n\,dq = E_n\int u_m^* u_n\,dq$$

and

$$\int (\mathbf{H}u_m)^* u_n\,dq = E_m^*\int u_m^* u_n\,dq,$$

† For a deeper discussion of the requirements for a satisfactory eigenfunction, see
Pauli, Handbuch der Physik, xxiv/1, second edition, Berlin, 1933, p. 123.

‡ The earlier English designations, 'characteristic values' and 'characteristic func-
tions', are not so commonly used in quantum mechanics.

‖ Schroedinger, Ann. der Phys. 79, 361 (1926).





where the constants $E_n$ and $E_m^*$ can evidently be taken outside the
integral sign. Since the left-hand sides of these equations are equal,
owing to the Hermitian character (55.25) of the operator $\mathbf{H}$, we can
now write the useful result

$$(E_n - E_m^*)\int u_m^* u_n\,dq = 0. \tag{60.6}$$

This equation has some important implications.

In the first place let us consider the case $m = n$, where the two
indices refer to the same eigenvalues and eigenfunctions. Since the
integral $\int u_n^* u_n\,dq$ will not be zero, owing to the essentially positive
character of $u_n^* u_n$, we must then conclude that

$$E_n = E_n^*. \tag{60.7}$$

Hence the allowable values for the energy will all be real, in agreement
with our earlier conclusion noted when the constant $E$ was first intro-
duced, and also in agreement with a more general principle by which
it can be shown that the eigenvalues for any observable quantity must
be real. (See (64.9).)

As a second case let us consider the situation $m \neq n$ corresponding
to the different eigenvalues $E_m \neq E_n$. To preserve the truth of equa-
tion (60.6) we shall then have to have

$$\int u_m^* u_n\,dq = 0. \tag{60.8}$$

We thus find that eigenfunctions corresponding to different eigenvalues
of the energy are necessarily orthogonal to each other. This is a very
important property of such eigenfunctions which is often made use of
in the quantum mechanics.
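
Both conclusions, real eigenvalues and orthogonal eigenfunctions, can be illustrated with a finite-dimensional stand-in for the operator. The following is a minimal sketch, assuming a hypothetical 2x2 Hermitian matrix in place of the Hamiltonian and a sum over components in place of the integral over $q$.

```python
import math

# Hypothetical 2x2 Hermitian matrix [[a, b], [conj(b), d]] standing in for H.
a, d = 2.0, 3.0     # real diagonal elements
b = 1.0 - 1.0j      # off-diagonal element

# Eigenvalues from the characteristic equation
#   E^2 - (a + d) E + (a d - |b|^2) = 0.
trace = a + d
det = a * d - abs(b) ** 2
# The discriminant equals (a - d)^2 + 4 |b|^2 >= 0, so both roots are real.
root = math.sqrt(trace**2 - 4.0 * det)
E_low, E_high = (trace - root) / 2.0, (trace + root) / 2.0


def eigenvector(energy):
    """Unnormalized solution of (H - E) u = 0, valid when b is not zero."""
    return [b, energy - a]


def overlap(u, v):
    """Discrete analogue of the integral of u* v over q."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


u_low, u_high = eigenvector(E_low), eigenvector(E_high)
```

For this matrix the two eigenvalues are distinct, and `overlap(u_low, u_high)` vanishes, in the same way that the integral of $u_m^* u_n$ vanishes for different energy eigenvalues.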

Finally, let us consider the possible situation $m \neq n$ with $E_m = E_n$
which arises in so-called cases of degeneracy, where the Schroedinger
equation (60.4) exhibits the possibility for quite different solutions $u_m$
and $u_n$ that correspond nevertheless to eigenvalues $E_m$ and $E_n$ which
turn out to be equal. In such a case, if $g$ is the number of independent
solutions, we say that the energy level is $g$-fold degenerate. Such
degeneracies can in general be removed by subjecting the system to
perturbing fields which will differently affect the different eigensolu-
tions and break up the energy level into $g$ neighbouring levels.

In the case of degeneracy we can no longer conclude that two eigen-
functions $u_m$ and $u_n$ belonging to the same energy level are necessarily
orthogonal, since the previous argument leading to equation (60.8) now
breaks down. Nevertheless, if we have $g$ independent solutions which
are not orthogonal, we can always form new solutions by linear com-
bination which will give us an equivalent set of $g$ functions which are
orthogonal. (See § 64 (c).)
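
One familiar way of forming such orthogonal linear combinations is the Gram-Schmidt process. The sketch below is an assumption about method, not necessarily the construction referred to in the text; the "functions" are represented by hypothetical values sampled at three points, and a sum over samples replaces the integral over $q$.

```python
import math


def overlap(u, v):
    """Discrete analogue of the integral of u* v over q."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


def normalize(u):
    """Scale u so that overlap(u, u) equals 1."""
    norm = math.sqrt(overlap(u, u).real)
    return [x / norm for x in u]


def orthogonalize(functions):
    """Gram-Schmidt: return orthonormal linear combinations of the inputs."""
    basis = []
    for f in functions:
        residual = list(f)
        for e in basis:
            # Remove the part of the residual that lies along e.
            coeff = overlap(e, residual)
            residual = [r - coeff * x for r, x in zip(residual, e)]
        basis.append(normalize(residual))
    return basis


# Two independent but non-orthogonal "solutions" for the same level.
u1 = [1.0, 1.0, 0.0]
u2 = [1.0, 0.0, 1.0]
v1, v2 = orthogonalize([u1, u2])
```

Because the new functions are linear combinations of solutions belonging to the same eigenvalue, they remain solutions for that eigenvalue while being mutually orthogonal and normalized.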

We shall find that the quantum mechanics often makes use of these
solutions of the Schroedinger equation (60.4), or energy eigenfunctions,
which we have been discussing. They are customarily normalized, so
that

$$\int u_n^* u_n\,dq = 1. \tag{60.9}$$

They are either necessarily orthogonal or may be orthogonalized by the
process described above, so that

$$\int u_m^* u_n\,dq = 0. \tag{60.10}$$

And as a rule they are known or assumed to provide a complete set of
functions in terms of which other functions $f(q)$ may be expanded in
accordance with the expression

$$f(q) = \sum_k c_k u_k(q), \tag{60.11}$$

where the $c_k$ are constant coefficients, where summation may be re-
placed by integration in the case of a continuous spectrum of eigen-
values, and where, as will be explained later, this equation is in general
to be interpreted in terms of convergence in the mean.

61. Wave-particle duality. De Broglie waves for free particles

Let us now turn to a consideration of the quantum mechanical point 
of view with respect to wave-particle duality. The difficulty of under- 
standing the nature of such a duality from a classical point of view 
provided the second of the considerations which we mentioned as 
necessitating a modification of classical mechanics. 

In a general way it can be said that the quantum mechanics is able
to incorporate the idea of an entity having dual particle-like and wave-
like properties, by restricting the particle-like properties so that they
do not permit the description or prediction of an exact space-time
motion, and by introducing probability waves as the actually appro-
priate apparatus for the making of predictions. The restrictions on
particle-like properties are such that we can still determine at will the
position of the entity, for example, by catching it in a small receptacle,
or its velocity, for example, by the Doppler effect in reflected light,
but are such that they do not permit a simultaneous knowledge of
position and velocity sufficient for an exact kinematic description. The
introduction of superposable waves for calculating the probability that
the entity will appear in a given place carries with it those possibilities
for interference and reinforcement which have seemed experimentally
necessary. The limitations on the kinematic description of the particle-
like behaviour of the entity can be shown in detail just sufficient to
avoid conflict with the complementary description of its wave-like
behaviour.†

It will be instructive to consider an entity, such as an electron or
proton, which we have grown accustomed to regard purely from the
particle point of view, and examine the nature of the probability waves
associated with it. If we take the motion as being in free space, where
the potential energy $V(x, y, z)$ can be taken as zero, the classical Hamil-
tonian would have the form

$$H = \frac{1}{2m}(p_x^2+p_y^2+p_z^2), \tag{61.1}$$

where $m$ is the mass of the classical particle. This Hamiltonian will
permit us to write the Schroedinger equation (57.6) in a specially simple
form if we use the $p$-language. We obtain

$$\frac{1}{2m}(p_x^2+p_y^2+p_z^2)\,\psi(p_x,p_y,p_z,t) + \frac{h}{2\pi i}\frac{\partial\psi(p_x,p_y,p_z,t)}{\partial t} = 0,$$

which has the simple solution

$$\psi(p_x,p_y,p_z,t) = a(p_x,p_y,p_z)\,e^{-(2\pi i/h)\frac{1}{2m}(p_x^2+p_y^2+p_z^2)t}, \tag{61.2}$$

where $a(p_x,p_y,p_z)$ is an arbitrary function of its arguments. This
expression gives the probability amplitude for finding particular values
of the momenta $p_x$, $p_y$, $p_z$. In this case of free space, however, these
momenta determine the energy in accordance with the equation

$$\frac{1}{2m}(p_x^2+p_y^2+p_z^2) = E. \tag{61.3}$$

Hence there will be no change in significance if we rewrite our expres-
sion for this probability amplitude in the simpler form

$$\psi(p_x,p_y,p_z,t) = a(p_x,p_y,p_z)\,e^{-(2\pi i/h)Et}. \tag{61.4}$$

We have thus obtained a simple expression for the probability ampli-
tude $\psi(p_x,p_y,p_z,t)$. We shall be specially interested, nevertheless, in the
probability amplitude $\psi(x,y,z,t)$ expressed in the $q$-language. Substituting
(61.4) into the appropriate transformation relation (57.2), this will be
given by

$$\psi(x,y,z,t) = \frac{1}{h^{3/2}}\iiint_{-\infty}^{+\infty} a(p_x,p_y,p_z)\,e^{(2\pi i/h)(p_xx+p_yy+p_zz-Et)}\,dp_x\,dp_y\,dp_z. \tag{61.5}$$

† See Pauli, Handbuch der Physik, xxiv/1, second edition, Berlin, 1933, p. 86.





It can readily be shown, however, that this expression can be regarded
as resulting from the superposition of plane waves, travelling through
the space $x$, $y$, $z$ with directions, frequencies, and phase relations that
can be chosen at will.

To see this let us first introduce the so-called de Broglie relations,
connecting the components of momenta and energy with wave numbers
and frequency. These can be written in the form

$$\sigma_x = \frac{p_x}{h}, \quad \sigma_y = \frac{p_y}{h}, \quad \sigma_z = \frac{p_z}{h}, \quad \nu = \frac{E}{h}, \tag{61.6}$$

and may be regarded as defining the new quantities, the wave numbers
$\sigma_x$, $\sigma_y$, $\sigma_z$ and the frequency $\nu$. Substituting these quantities into (61.5),
this expression can evidently be rewritten in the form

$$\psi(x,y,z,t) = \iiint_{-\infty}^{+\infty} A(\sigma_x,\sigma_y,\sigma_z)\,e^{2\pi i(\sigma_xx+\sigma_yy+\sigma_zz-\nu t)}\,d\sigma_x\,d\sigma_y\,d\sigma_z, \tag{61.7}$$

where $A(\sigma_x,\sigma_y,\sigma_z)$ is now an arbitrary function of the new arguments.
And, making use of (61.3) and (61.6), we can write

$$\nu = \frac{h}{2m}(\sigma_x^2+\sigma_y^2+\sigma_z^2) \tag{61.8}$$

as a necessary relation connecting frequency and wave numbers. Equa-
tion (61.7) is, however, a well-known form of expression for the result
of superposing plane waves, and equation (61.8) can be spoken of as
the law of dispersion for the waves under consideration.

In accordance with (61.7) the individual waves which are superposed
to give the total probability amplitude $\psi(x,y,z,t)$ will be of the form

$$\psi = A\,e^{2\pi i(\sigma_xx+\sigma_yy+\sigma_zz-\nu t)}. \tag{61.9}$$

(It should be noted that this expression for $\psi$ does not go to zero as
$x$, $y$, and $z$ go to infinity, and must be regarded as a limiting case of
the quadratically integrable $\psi$'s that we have admitted.) They may be
called the de Broglie waves associated with a free particle. By intro-
ducing a familiar relation between exponential and trigonometric lan-
guage, this expression for the de Broglie waves may be rewritten in
the form

$$\psi = A\cos 2\pi(\sigma_xx+\sigma_yy+\sigma_zz-\nu t) + iA\sin 2\pi(\sigma_xx+\sigma_yy+\sigma_zz-\nu t). \tag{61.10}$$

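And, by noting the periodicity of the expression with respect to spatial
and temporal displacement, it may also be written as

$$\psi = A\cos 2\pi\left(\frac{x\cos\alpha+y\cos\beta+z\cos\gamma}{\lambda}-\frac{t}{\tau}\right) + iA\sin 2\pi\left(\frac{x\cos\alpha+y\cos\beta+z\cos\gamma}{\lambda}-\frac{t}{\tau}\right), \tag{61.11}$$

where

$$\lambda = \frac{\cos\alpha}{\sigma_x} = \frac{\cos\beta}{\sigma_y} = \frac{\cos\gamma}{\sigma_z} \quad\text{and}\quad \tau = \frac{1}{\nu} \tag{61.12}$$

are seen to be the wave-length and period of the wave, and $\alpha$, $\beta$, and $\gamma$
the angles which describe its direction of propagation with respect to
the $x$, $y$, and $z$ axes.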

Some of the properties of these de Broglie waves may now be con- 
sidered. 

In the first place it will be noticed, for example from the form of
expression (61.10), that the amplitude $\psi$ which is propagated by these
waves will be a complex quantity for all non-vanishing values of the
arguments of the sine and cosine terms and of $A$, which latter can itself
in general be a real, imaginary, or complex quantity. This is in agree-
ment with our earlier mention of the fact that the probability ampli-
tudes would in general turn out to be complex quantities.

In the second place, if we consider the Schroedinger equation for our
present system as given in accordance with (56.6) in the $q$-language,

$$\frac{h^2}{8\pi^2m}\left(\frac{\partial^2\psi}{\partial x^2}+\frac{\partial^2\psi}{\partial y^2}+\frac{\partial^2\psi}{\partial z^2}\right) = \frac{h}{2\pi i}\frac{\partial\psi}{\partial t},$$

and substitute the expression (61.9) for an individual de Broglie wave,
we readily obtain, after cancelling common factors, the result

$$\nu = \frac{h}{2m}(\sigma_x^2+\sigma_y^2+\sigma_z^2),$$

in agreement with the necessary relation (61.8) which we have called
the law of dispersion for the waves in question. We thus see that each
of the de Broglie waves of the form (61.9), which are superposed in the
integrated expression (61.7), obeys the Schroedinger wave equation.
This is, of course, a necessary consequence of our method of develop-
ment. Furthermore, it will be seen that the full expression (61.7) is
the most general solution of that wave equation.
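
The statement that an individual de Broglie wave obeys the free-particle wave equation, provided its frequency follows the law of dispersion, can be checked by finite differences. This is a sketch under stated assumptions: units with $h = m = 1$, an arbitrary illustrative choice of wave numbers, and hypothetical function names.

```python
import cmath
import math

H_PLANCK = 1.0  # Planck's constant h in arbitrary units (assumption)
MASS = 1.0      # particle mass m in the same units (assumption)


def frequency(sx, sy, sz):
    """Law of dispersion: nu = h (sx^2 + sy^2 + sz^2) / 2m."""
    return H_PLANCK * (sx * sx + sy * sy + sz * sz) / (2.0 * MASS)


def de_broglie_wave(x, y, z, t, sigma, amplitude=1.0):
    """Plane wave A exp[2 pi i (sx x + sy y + sz z - nu t)]."""
    sx, sy, sz = sigma
    phase = 2.0 * math.pi * (sx * x + sy * y + sz * z - frequency(sx, sy, sz) * t)
    return amplitude * cmath.exp(1j * phase)


def schroedinger_residual(point, t, sigma, step=1e-4):
    """(h^2 / 8 pi^2 m) laplacian(psi) - (h / 2 pi i) dpsi/dt, by central differences."""
    x, y, z = point

    def psi(px, py, pz, pt):
        return de_broglie_wave(px, py, pz, pt, sigma)

    laplacian = (
        psi(x + step, y, z, t) + psi(x - step, y, z, t)
        + psi(x, y + step, z, t) + psi(x, y - step, z, t)
        + psi(x, y, z + step, t) + psi(x, y, z - step, t)
        - 6.0 * psi(x, y, z, t)
    ) / step**2
    dpsi_dt = (psi(x, y, z, t + step) - psi(x, y, z, t - step)) / (2.0 * step)
    kinetic = H_PLANCK**2 / (8.0 * math.pi**2 * MASS) * laplacian
    return kinetic - H_PLANCK / (2j * math.pi) * dpsi_dt
```

The residual returned is zero to within the finite-difference error for any point, time, and choice of wave numbers, since the frequency is always taken from the dispersion law.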

In the third place it will be seen that these de Broglie waves can
evidently be superposed in such a way as to form 'wave packets'
which at any selected time will give a large probability for finding
approximately such values for the position and momentum of the
particle as we may desire to specify. The nature of such wave packets
will be more thoroughly considered in the next section. It may be
noted now, however, that a wave packet which gives a large proba-
bility for finding specified values of the momenta $p_x$, $p_y$, and $p_z$ can
only be obtained by assigning relatively large values to the coefficients
$A(\sigma_x,\sigma_y,\sigma_z)$ in (61.7) in the neighbourhood of the values of the wave
numbers $\sigma_x$, $\sigma_y$, and $\sigma_z$ that correspond to the momenta which we wish
to favour. Making use of the known approximate connexion between
group velocity and dispersion,

$$v_x = \frac{d\nu}{d\sigma_x}, \quad v_y = \frac{d\nu}{d\sigma_y}, \quad v_z = \frac{d\nu}{d\sigma_z}, \tag{61.13}$$

and substituting from the dispersion equation (61.8), we then obtain

$$v_x = \frac{h\sigma_x}{m}, \quad v_y = \frac{h\sigma_y}{m}, \quad v_z = \frac{h\sigma_z}{m} \tag{61.14}$$

for the approximate velocity of such a wave packet, where $\sigma_x$, $\sigma_y$, and
$\sigma_z$ are the values of the wave numbers that have been favoured in the
construction of the packet. Substituting the de Broglie relations (61.6),
this now gives us the satisfactory result

$$v_x = \frac{p_x}{m}, \quad v_y = \frac{p_y}{m}, \quad v_z = \frac{p_z}{m} \tag{61.15}$$

as a connexion between the approximate velocity of the wave packet
and the probable values which will be found for the components of
momentum.
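
The step from the dispersion law to the packet velocity can be illustrated numerically in one dimension. The sketch below assumes arbitrary units and illustrative values of $h$, $m$, and the favoured wave number; the function names are hypothetical.

```python
H_PLANCK = 1.0  # Planck's constant h in arbitrary units (assumption)
MASS = 2.0      # particle mass m in the same units (assumption)


def frequency(sigma):
    """One-dimensional law of dispersion: nu = h sigma^2 / 2m."""
    return H_PLANCK * sigma**2 / (2.0 * MASS)


def group_velocity(sigma, step=1e-4):
    """Central-difference estimate of d(nu)/d(sigma)."""
    return (frequency(sigma + step) - frequency(sigma - step)) / (2.0 * step)


sigma_favoured = 3.0                     # wave number favoured in the packet
momentum = H_PLANCK * sigma_favoured     # de Broglie relation p = h sigma
velocity_from_waves = group_velocity(sigma_favoured)
velocity_from_momentum = momentum / MASS
```

The two velocities agree, which is the content of the result connecting the velocity of the wave packet with the favoured momentum.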

Most important of all, it will now be appreciated that the de Broglie
waves provide the possibilities for interference and reinforcement which
are needed to explain experiments on the diffraction of particles, such
as those of Davisson and Germer on the preferred directions for the
reflection of electrons from a crystal grating. Since these effects are
obtained with electrons whose momentum is first fixed by dropping
through an appropriate field, the de Broglie waves will be such as to
give a large probability for finding that value of the momentum.
Hence, in accordance with our previous considerations, noting (61.6)
and (61.12) in particular, these will be waves whose wave-length will
satisfy

$$\lambda = \frac{h}{\sqrt{p_x^2+p_y^2+p_z^2}}.$$

It is experimentally known, however, that the diffraction effects are
such as would be accounted for by the interference and reinforcement
of waves having the de Broglie wave-length

$$\lambda = \frac{h}{mv}. \tag{61.16}$$
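
To give an idea of the magnitudes involved, the wave-length $\lambda = h/mv$ can be evaluated for an electron that has dropped through a given potential difference. This is a sketch only: the constants are modern rounded SI values, not figures from the text, the non-relativistic relation $p = \sqrt{2meV}$ is assumed, and 54 volts is taken merely as an illustrative potential of the order used in electron diffraction work.

```python
import math

PLANCK_H = 6.626e-34           # J s (assumed modern value)
ELECTRON_MASS = 9.109e-31      # kg (assumed modern value)
ELEMENTARY_CHARGE = 1.602e-19  # C (assumed modern value)


def de_broglie_wavelength(potential_volts):
    """Wave-length h / p for an electron accelerated from rest through the potential."""
    # Non-relativistic momentum from the energy balance p^2 / 2m = e V.
    momentum = math.sqrt(2.0 * ELECTRON_MASS * ELEMENTARY_CHARGE * potential_volts)
    return PLANCK_H / momentum


wavelength_54v = de_broglie_wavelength(54.0)  # metres
```

The result is of the order of $10^{-10}$ metres, comparable with the spacing of atoms in a crystal, which is why a crystal can serve as a diffraction grating for such electrons.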


The foregoing considerations are sufficient to give an idea of the
quantum mechanical treatment of wave-particle duality in the case of
entities which were customarily regarded solely from the particle point
of view in the past. Somewhat similar considerations apply to the
reverse case of electromagnetic radiation, which was in the past re-
garded solely from the wave point of view, but which may now be
treated from the particle point of view with the help of the concept
of photons. Nevertheless, as these photons travel with the velocity of
light, a non-relativistic treatment would not be sufficient, and, as a con-
sequence of this, one cannot assign a precise value for the position of
a photon.


62. The Heisenberg uncertainty relation 

(a) Case of a free particle. Wave packets. We now come to the
quantum mechanical treatment of the uncertainty relations connecting
coordinates and momenta. The necessity for some restriction on the
possibility of exact knowledge as to the simultaneous values of con-
jugated coordinates and momenta was the third of the considerations
which we discussed as necessitating a revision of classical mechanics.

To treat the problem we may first consider the simple case of a single
free particle and investigate the extent to which its position and
momentum can be specified by a properly chosen quantum mechanical
state. Since the probability amplitude $\psi(x,y,z,t)$ representing any such
state can be obtained, as was seen in the preceding section, by the
superposition of de Broglie waves, this means that we must investigate
the nature of the wave packets which can be built up from these waves
in such a way as to give an approximate specification of position and
momentum.

The individual de Broglie waves to be used in constructing the wave
packet will be of the form (61.9), namely

$$\psi = A\,e^{2\pi i(\sigma_xx+\sigma_yy+\sigma_zz-\nu t)}.$$

If now we desire at some given time to build up a packet which gives
a large probability for finding the particle within a specified range $\Delta x$,
it is evident that we shall at least have to make use of waves with
sufficient difference in their wave numbers $\sigma_x$ and $\sigma_x+\Delta\sigma_x$ so that they
can reinforce in the middle of the range $\Delta x$ and interfere outside. Since
the number of wave crests per unit distance along the $x$-axis for two
such waves will be equal to $\sigma_x$ and $\sigma_x+\Delta\sigma_x$, the condition for such
reinforcement and interference will then evidently be

$$(\sigma_x+\Delta\sigma_x)\Delta x - \sigma_x\Delta x = \Delta\sigma_x\Delta x \approx 1,$$

as this is what is needed to give us one more wave crest in the distance
$\Delta x$ for one of the waves than the other.

Hence, considering all three dimensions, we may now write

$$\Delta\sigma_x\Delta x \approx 1, \quad \Delta\sigma_y\Delta y \approx 1, \quad \Delta\sigma_z\Delta z \approx 1 \tag{62.1}$$

as expressions giving the ranges in wave numbers $\Delta\sigma_x$, $\Delta\sigma_y$, $\Delta\sigma_z$ which
will be needed to construct a wave packet which locates our particle
in the approximate range $\Delta x\,\Delta y\,\Delta z$. If, however, we have to assign
appreciable amplitudes to waves having these differences in wave num-
ber, it is evident from the preceding section that we shall have to assign
appreciable probability amplitudes for finding momenta $p_x$, $p_y$, $p_z$ that
differ in accordance with the de Broglie relations (61.6) by amounts of
the order

$$\Delta p_x = h\Delta\sigma_x, \quad \Delta p_y = h\Delta\sigma_y, \quad \Delta p_z = h\Delta\sigma_z. \tag{62.2}$$

Combining (62.1) and (62.2), we then see that the quantum mechani-
cal state, corresponding to a wave packet specially constructed for the
purpose of specifying the position and momentum of a particle, will
only do so subject to the Heisenberg uncertainty relations

$$\Delta p_x\Delta x \approx h, \quad \Delta p_y\Delta y \approx h, \quad \Delta p_z\Delta z \approx h. \tag{62.3}$$

The formalism of the quantum mechanics is thus devised to agree with
limitations on classical measurability which are conceptually necessary
as soon as we consider the disturbing effect that observation itself
would have on small systems.

For particles of large mass the uncertainties corresponding to (62.3)
would be far below the limits of practical precision. Thus for a particle
having a mass of 1 gram we could write

$$\Delta v_x\Delta x \approx h = 6.5\times10^{-27},$$

where $\Delta v_x$ is the uncertainty in the $x$-component of velocity, and hence
could have a simultaneous specification of position and velocity with
accuracies of the order of $8\times10^{-14}$ centimetres and centimetres per
second respectively. It is therefore not surprising that the classical
mechanics, which was devised to describe the behaviour of large masses,
was able to employ the idea of exact kinematic description.
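
The arithmetic of this estimate, and its contrast with the case of an electron, can be sketched as follows. The value of $h$ is the rounded c.g.s. figure quoted in the text; the electron mass and the atomic dimension of $10^{-8}$ cm are assumed illustrative values, and the function name is hypothetical.

```python
PLANCK_H_CGS = 6.5e-27     # erg s, the rounded value quoted in the text
ELECTRON_MASS_G = 9.1e-28  # g (assumed value, not from the text)


def velocity_uncertainty(mass_g, dx_cm):
    """Order-of-magnitude velocity uncertainty from m * dv * dx ~ h, in cm/s."""
    return PLANCK_H_CGS / (mass_g * dx_cm)


# One gram located to within 8e-14 cm.
dv_gram = velocity_uncertainty(1.0, 8e-14)

# An electron located to within an atomic dimension of about 1e-8 cm.
dv_electron = velocity_uncertainty(ELECTRON_MASS_G, 1e-8)
```

For the gram mass the velocity uncertainty comes out at about $8\times10^{-14}$ cm/sec, matching the position uncertainty in magnitude, whereas for the electron it is of the order of $10^{8}$ cm/sec, so that no exact kinematic description is possible.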

The uncertainty relations (62.3) connecting coordinates and momenta
also imply an uncertainty relation connecting energy and time. If we
consider for simplicity the motion of a particle of mass $m$ in the $x$-
direction with the approximate velocity $v$ in that direction, we can
re-express (62.3) in the form

$$m\,\Delta v\,\Delta x = (mv\,\Delta v)\,\frac{\Delta x}{v} \approx h.$$

And this result can be written as

$$\Delta E\,\Delta t \approx h, \tag{62.4}$$

where $\Delta E$ denotes the uncertainty in the energy of the particle, and
$\Delta t$ the uncertainty as to the time at which this energy is transferred
from one side to the other of a selected surface. This means, for
example, if we arrange a shutter for letting particles out from a con-
tainer, that the time $\Delta t$ during which the shutter is kept open and the
uncertainty $\Delta E$ in the energy of a particle which escapes would be
connected by the above relation.

Before turning to a more general consideration of the nature of 
uncertainty relations it will be of interest to point out certain further 
properties of the wave packets which we have used for investigating 
the simple case of a free particle. 

Let us denote by $\bar{x}$ and $\bar{p}_x$ the mean values or expectation values
for the $x$-coordinate and $x$-component of momentum of the particle as
obtained by integrating over the whole of the wave packet. We have
as expressions for these quantities

$$\bar{x} = \iiint \psi^*\,x\,\psi\,dx\,dy\,dz \quad\text{and}\quad \bar{p}_x = \iiint \psi^*\,p_x\,\psi\,dp_x\,dp_y\,dp_z \tag{62.5}$$

in accordance with the rules for obtaining expectation values given by
(57.4). Differentiating these expressions with respect to the time, and
using Schroedinger's equation (57.5) to re-express the values of $\partial\psi^*/\partial t$
and $\partial\psi/\partial t$ thus introduced, it can readily be shown that we obtain for
the case of a free particle

$$\frac{d\bar{x}}{dt} = \overline{\left(\frac{\partial H}{\partial p_x}\right)} = \frac{\bar{p}_x}{m} \quad\text{and}\quad \frac{d\bar{p}_x}{dt} = 0, \tag{62.6}$$

these results being a special case of equations (63.9) which we shall
derive in the next section. For sharply defined wave packets, such as
can be obtained with particles of large mass, the results given by (62.6)
approach the classical results which give the velocity of a particle in
terms of its momentum and assert that for a free particle the momen-
tum will be constant.

To mention a further property of wave packets for a free particle,
it is of interest to consider the change in their dimensions with time.
To do this let us make our ideas as to degree of uncertainty somewhat
more precise by defining the mean square uncertainties in the position
and momentum of the particle by

$$\overline{\Delta x^2} = \iiint \psi^*(x-\bar{x})^2\psi\,dx\,dy\,dz \quad\text{and}\quad \overline{\Delta p_x^2} = \iiint \psi^*(p_x-\bar{p}_x)^2\psi\,dp_x\,dp_y\,dp_z. \tag{62.7}$$

It can then be shown† that the time dependence of the first of these
quantities is given by

$$\overline{\Delta x^2} = (\overline{\Delta x^2})_{t=0} + 2t\iiint\left[(x-\bar{x})S_x\right]_{t=0}dx\,dy\,dz + \frac{\overline{\Delta p_x^2}}{m^2}\,t^2, \tag{62.8}$$

while the uncertainty in momentum will remain constant in the case
of a free particle unacted on by forces.

At first sight it might seem that equation (62.8) contains nothing
particularly characteristic of the quantum mechanics, since it would
also hold for classical particles distributed with the current density $S_x$
and mean-square uncertainty in momentum $\overline{\Delta p_x^2}$. It is to be appre-
ciated, however, that in such a classical distribution the quantities
$(\overline{\Delta x^2})_{t=0}$ and $\overline{\Delta p_x^2}$ could be taken as small as desired, which would not
be possible in the quantum mechanical interpretation.

In accordance with (62.8) it is seen that the dimensions of a wave
packet for a free particle would be a quadratic function of the time,
increasing without limit after passing through a possible minimum,
thus making it increasingly inappropriate as time proceeds to attempt
a classical kinematic description of behaviour.
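
The quadratic growth of the packet can be put into numbers. The sketch below assumes a mean square width of the form $(\overline{\Delta x^2})_{t=0} + 2tC + (\overline{\Delta p_x^2}/m^2)t^2$ with the linear coefficient $C$ set to zero, takes the momentum spread to be $h/4\pi\Delta x_0$ as an illustrative assumption (the smallest spread compatible with the usual form of the uncertainty relation), and uses an assumed electron mass; none of these numerical choices comes from the text.

```python
import math

PLANCK_H = 6.5e-27       # erg s, the rounded value quoted in the text
ELECTRON_MASS = 9.1e-28  # g (assumed value, not from the text)


def mean_square_width(t, dx0_sq, dp_sq, mass, correlation=0.0):
    """Mean square width at time t; `correlation` is the coefficient of 2t."""
    return dx0_sq + 2.0 * t * correlation + dp_sq * t**2 / mass**2


def doubling_time(dx0, dp, mass):
    """Time for the r.m.s. width to double when the linear term vanishes."""
    # Solve dx0^2 + (dp / m)^2 t^2 = (2 dx0)^2 for t.
    return math.sqrt(3.0) * dx0 * mass / dp


dx0 = 1.0e-8                           # cm, roughly an atomic dimension
dp = PLANCK_H / (4.0 * math.pi * dx0)  # g cm/s, assumed momentum spread
t_double = doubling_time(dx0, dp, ELECTRON_MASS)
```

With these figures the packet for an electron doubles its width in a time of the order of $10^{-16}$ second, while the same formula applied to a mass of one gram gives a time longer by many orders of magnitude.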

(b) General treatment of uncertainty relations. We may now turn
to a somewhat mathematical consideration of uncertainty relations
applicable to more general kinds of systems and observable quantities.‡


† See, for example, Kemble, The Fundamental Principles of Quantum Mechanics, New
York, 1937, equation (33.18), which has been re-expressed above in terms of current density.

‡ See Robertson, Phys. Rev. 34, 163 (1929); 35, 667 (1930); 46, 794 (1934);
Schroedinger, Berl. Ber. 1930, p. 296.





Let the quantities of interest correspond to the operators F and G. At 
a particular given time we can then write for the mean or expectation 
values of the two quantities 

J = J dq and © = J dq, (62.9) 

where ijs is the probability amplitude for the state, at that time, expressed 
in the g'-language. We may now define the uncertainties A-F and Aff 
in these quantities somewhat more definitely than previously as the 
root mean square of the deviations from the expectation value, in 
accordance with the equations 

and {LQf = ^ ils*{G-Q)'^dq. (62.10) 

As a simplification let us now use the s 3 mibols 

f=F-J' and g = Q-Q (62.11) 

as an abbreviation for those operators; and let us define two new wave 
fmcUonsby ^ ,,2 j 2 , 

Substituting in (62.10), we obtain 

{KFf = I dq = f dq = f dq 

^ (62.13) 

and = J i/r*g ^2 = J = j 

where the second forms of expression come from the Hermitian charac- 
ter of the operators f and g, in accordance with (55.25). 

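We are now ready to employ the known Schwarzian inequality† 

$$\int \psi_1^*\psi_1 \, dq \int \psi_2^*\psi_2 \, dq \ge \int \psi_1^*\psi_2 \, dq \int \psi_2^*\psi_1 \, dq. \tag{62.14}$$

Substituting from above, and again making use of (55.25), we obtain 

$$\begin{aligned}
(\Delta F)^2(\Delta G)^2 &= \int (\mathbf{f}\psi)^*\mathbf{f}\psi \, dq \int (\mathbf{g}\psi)^*\mathbf{g}\psi \, dq \\
&\ge \int (\mathbf{f}\psi)^*\mathbf{g}\psi \, dq \int (\mathbf{g}\psi)^*\mathbf{f}\psi \, dq \\
&= \int \psi^*\mathbf{fg}\psi \, dq \int \psi^*\mathbf{gf}\psi \, dq.
\end{aligned} \tag{62.15}$$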

Since in general f and g will not commute, we shall find it useful to 
re-express the right-hand side of this inequality with the help of the 


† See Weyl, Gruppentheorie und Quantenmechanik, tr. Robertson, London, 1931, p. 393. 







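method for decomposing products given by (55.34). We can then write 

$$\begin{aligned}
\int \psi^*\mathbf{fg}\psi \, dq \int \psi^*\mathbf{gf}\psi \, dq
&= \left[\int \psi^*\,\tfrac{1}{2}(\mathbf{fg}+\mathbf{gf})\,\psi \, dq\right]^2 + \left[\int \psi^*\,\tfrac{1}{2i}(\mathbf{fg}-\mathbf{gf})\,\psi \, dq\right]^2 \\
&= \left[\tfrac{1}{2}\overline{(fg+gf)}\right]^2 + \left[\tfrac{1}{2i}\overline{(fg-gf)}\right]^2,
\end{aligned} \tag{62.16}$$

where, in accordance with (55.29) and (55.31), we now have the mean 
values corresponding to Hermitian operators. 

Introducing (62.16) into (62.15), and also using the relations (62.11) 
by which f and g were defined, we now easily find that our uncertainty 
relation for any two quantities can be expressed in the general form 

$$(\Delta F)^2(\Delta G)^2 \ge \left[\tfrac{1}{2}\overline{(FG+GF)}-\bar F\bar G\right]^2 + \left[\tfrac{1}{2i}\overline{(FG-GF)}\right]^2. \tag{62.17}$$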


Fixing our attention on the right-hand side of this expression, we see 
that the product of the two uncertainties would at least have to be 
large enough to correspond to the second of the two terms given, in 
the case of two operators which do not commute. It should be noted 
that the limit of the complementary uncertainty itself depends in 
general on the state considered, since expectation values appear on the 
right in (62.17). 
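
The general relation can be checked numerically in a simple case. The following is only a minimal sketch, under the assumption of a two-state system in which observables are represented by 2×2 Hermitian matrices and the integrals over q become finite sums; the Pauli matrices `SIGMA_X` and `SIGMA_Y` and all helper names are illustrative choices that do not appear in the text. It evaluates the product $(\Delta F)^2(\Delta G)^2$ and the sum of the symmetrized and commutator terms that bounds it from below.

```python
# Illustrative check of the general uncertainty relation for 2x2 matrices.
# Assumption: observables are 2x2 Hermitian matrices and states are
# normalized 2-component complex vectors, so integrals become finite sums.

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expect(op, psi):
    """Expectation value sum_ij conj(psi_i) * op_ij * psi_j."""
    return sum(psi[i].conjugate() * op[i][j] * psi[j]
               for i in range(2) for j in range(2))

def both_sides(F, G, psi):
    """Return (left side, right side) of the relation for F, G in state psi."""
    mean_f = expect(F, psi).real
    mean_g = expect(G, psi).real
    var_f = expect(matmul(F, F), psi).real - mean_f ** 2
    var_g = expect(matmul(G, G), psi).real - mean_g ** 2
    fg = expect(matmul(F, G), psi)
    gf = expect(matmul(G, F), psi)
    sym = (0.5 * (fg + gf)).real - mean_f * mean_g  # symmetrized term
    com = ((fg - gf) / 2j).real                     # commutator term
    return var_f * var_g, sym ** 2 + com ** 2

SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]

spin_up = [1 + 0j, 0j]  # illustrative state
lhs, rhs = both_sides(SIGMA_X, SIGMA_Y, spin_up)
print(lhs, rhs)
```

For this particular state the symmetrized term vanishes and the two sides coincide, so the bound is set entirely by the commutator term, which illustrates the remark that the limit depends on the state considered.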

As a special example of the applicability of (62.17), we may now 
consider the case of conjugate canonical coordinates and momenta 
having the commutator $h/2\pi i$ given by (55.38). We then obtain from 
(62.17) 

$$\Delta q \, \Delta p \ge \frac{h}{4\pi} \tag{62.18}$$

as an expression of the Heisenberg uncertainty relation, which is now 
applicable to conjugate canonical coordinates and momenta in general, 
and is also somewhat more definite than our original expression (62.3) 
for the case of a free particle. 

In addition to the uncertainty relations connecting coordinates and 
momenta, we also had for free particles an uncertainty relation between 
energy and time (62.4) which can now be written somewhat more 
definitely in the form 

$$\Delta E \, \Delta t \ge \frac{h}{4\pi}. \tag{62.19}$$


For the case of a free particle, this relation connects the uncertainty 
$\Delta E$ in the energy of the particle with the uncertainty $\Delta t$ in the time 






at which the particle passes through a given boundary. In the case of 
the more general systems which now interest us, since a change in their 
energy content could be brought about by the impingement of particles, 
it will be natural to retain this same relation, with $\Delta E$ now denoting 
the uncertainty in any amount of energy E which is transferred to the 
system, and $\Delta t$ the uncertainty in the time t at which the transfer takes 
place. 

This new form of the principle would also have a bearing on the 
accuracy with which the total energy of a system could be regarded as 
determined. To measure this total energy we should have to allow for 
the effects of interaction between the system and the apparatus of 
measurement, and with a time $\Delta t$ available for the measurement we 
should at least have an uncertainty $\Delta E$ in the result of the order given 
by (62.19), since this would express our lack of knowledge as to the 
transfer between system and measuring apparatus. Conversely, in 
order to fix the time of occurrence of an event within a system with 
a precision $\Delta t$ in terms of a clock external thereto, an uncontrollable 
exchange of energy $\Delta E$ with the system of the order given by (62.19) 
would be necessary. 
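
To give a feeling for the magnitudes involved, the following minimal sketch evaluates the smallest energy uncertainty compatible with a given time interval, taking the energy-time relation in the form $\Delta E\,\Delta t \ge h/4\pi$. The interval of $10^{-8}$ seconds and the function name `min_energy_uncertainty` are illustrative assumptions, not values taken from the text.

```python
import math

PLANCK_H = 6.62607015e-34       # Planck constant in J s (exact SI value)
JOULE_PER_EV = 1.602176634e-19  # joules per electron-volt (exact SI value)

def min_energy_uncertainty(delta_t):
    """Smallest energy uncertainty in joules allowed by dE * dt >= h / (4 pi)."""
    return PLANCK_H / (4.0 * math.pi * delta_t)

delta_t = 1.0e-8  # illustrative time available for the measurement, in seconds
delta_e = min_energy_uncertainty(delta_t)
print(f"dE >= {delta_e:.3e} J = {delta_e / JOULE_PER_EV:.3e} eV")
```

Halving the available time doubles the bound, which is the inverse proportionality expressed by the relation.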

63. Correspondence between classical and quantum mechanical results 

(a) Change in expectation values with time. We have now completed 
our account of the quantum mechanical point of view with regard 
to those problems connected with energy levels, wave-particle duality, 
and uncertainty which seemed so difficult from a classical stand- 
point. It remains to study the very extensive correspondence which 
relates quantum mechanical with classical results. Among other 
things this will be helpful in explaining and justifying the use of 
the classical mechanics under those limiting conditions where it remains 
applicable. 

To investigate this matter it will first of all be advantageous to 
obtain a general expression for the rate of change with time in the 
expectation value for any observable quantity. This will then make 
it possible to compare quantum mechanical expressions for the rate 
of change with time in the expectation values of observable quanti- 
ties with classical expressions for the rate of change in their precise 
values. 

In accordance with our fundamental expression (57.4), the average 
or expectation value for any function $F(q,p)$ of the coordinates and 
momenta can be calculated from a knowledge of the corresponding 
operator $\mathbf{F}(q,p)$ with the help of the equation 

$$\bar F = \int \psi^*\mathbf{F}\psi \, dq. \tag{63.1}$$

Considering for simplicity cases where F is not itself an explicit 
function of the time, we then obtain by differentiation 

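$$\begin{aligned}
\frac{d\bar F}{dt} &= \int \left(\frac{\partial\psi^*}{\partial t}\mathbf{F}\psi + \psi^*\mathbf{F}\frac{\partial\psi}{\partial t}\right) dq \\
&= \frac{2\pi i}{h}\int \left[(\mathbf{H}\psi)^*\mathbf{F}\psi - \psi^*\mathbf{F}(\mathbf{H}\psi)\right] dq \\
&= \frac{2\pi i}{h}\int \left[\psi^*\mathbf{H}(\mathbf{F}\psi) - \psi^*\mathbf{F}(\mathbf{H}\psi)\right] dq \\
&= \frac{2\pi i}{h}\int \psi^*(\mathbf{HF}-\mathbf{FH})\psi \, dq.
\end{aligned} \tag{63.2}$$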

The first form in which the equation is written is a direct result of 
differentiation; the second form containing the Hamiltonian operator 
H comes by substituting for $\partial\psi^*/\partial t$ and $\partial\psi/\partial t$ with the help of the 
Schroedinger equation (57.5); the third form results from the definition 
of Hermitian operators (55.25); and the last form results from the 
derived property of Hermitian operators (55.26). 

Introducing our previous abbreviation (55.32) for the commutator for 
two operators, we can also write (63.2) in the forms 

$$\frac{d\bar F}{dt} = \frac{2\pi}{ih}\int \psi^*[\mathbf{F},\mathbf{H}]\psi \, dq = \frac{2\pi}{ih}\overline{[F,H]}, \tag{63.3}$$

where $\overline{[F,H]}$ is the expectation value for the quantity that corresponds 
to the operator $[\mathbf{F},\mathbf{H}]$. Since we have already seen (55.31) that i times 
the commutator for two Hermitian operators is itself Hermitian, it is 
evident that the rate of change of $\bar F$ with time will be a real quantity, 
in satisfactory agreement with the reality of $\bar F$ itself, as given by 
(55.47). 

Comparing the quantum mechanical equation (63.3) for the rate of 
change in the expectation value of $F(q,p)$, with the classical equation 
(11.4) for the rate of change in the exact value of $F(q,p)$, we also see 
that the quantity $\frac{2\pi}{ih}\overline{[F,H]}$ may be regarded in the present connexion 
as the quantum mechanical analogue of the classical Poisson 
bracket $(F, H)$. This is the first of several examples (compare (67.27) 
and (81.21)) showing a close relation between the classical Poisson 
bracket $(M, N)$ for two quantities and the quantum mechanical operator 
$\frac{2\pi}{ih}[\mathbf{M},\mathbf{N}]$ which depends on the commutator for the two operators 
that correspond to those quantities. This relationship is made to play 
a fundamental role in the Dirac method of developing the quantum 
mechanics.† 

(b) The analogue of the Hamiltonian equations of motion. We shall 
first apply the result given by (63.3) to the calculation of the rates of 
change in the expectation values for the coordinates and momenta 
themselves. 

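To do this let us first consider the character of the commutator 
$[\mathbf{H},\mathbf{q}_k]$. We shall assume that the energy operator is given in accordance 
with the discussion of § 55 (f) as a sum of symmetrized terms of the form 

$$\mathbf{H} = \sum \tfrac{1}{2}C(\mathbf{QP}+\mathbf{PQ}), \tag{63.4}$$

where Q and P are operators which are dependent on the q's and p's 
separately. For the commutator in question we can then write 

$$\begin{aligned}
[\mathbf{H},\mathbf{q}_k] &= \sum \tfrac{1}{2}C[(\mathbf{QP}+\mathbf{PQ}),\mathbf{q}_k] \\
&= \sum \tfrac{1}{2}C\{\mathbf{Q}[\mathbf{P},\mathbf{q}_k]+[\mathbf{Q},\mathbf{q}_k]\mathbf{P}+\mathbf{P}[\mathbf{Q},\mathbf{q}_k]+[\mathbf{P},\mathbf{q}_k]\mathbf{Q}\} \\
&= \sum \tfrac{1}{2}C\{\mathbf{Q}[\mathbf{P},\mathbf{q}_k]+[\mathbf{P},\mathbf{q}_k]\mathbf{Q}\},
\end{aligned} \tag{63.5}$$

where the justification for the various forms of writing is either evident 
or provided by the fundamental properties of commutators previously 
given by equations (55.33). 

For the commutator $[\mathbf{P},\mathbf{q}_k]$, however, we can obtain, with the help 
of an intermediate expression in the q-language, 

$$[\mathbf{P},\mathbf{q}_k] = \mathbf{P}\mathbf{q}_k-\mathbf{q}_k\mathbf{P} = \frac{h}{2\pi i}\frac{\partial\mathbf{P}}{\partial\mathbf{p}_k}. \tag{63.6}$$

Substituting into (63.5), we then obtain, with the help of the original 
expression (63.4) for the Hamiltonian itself, 

$$[\mathbf{H},\mathbf{q}_k] = \frac{h}{2\pi i}\frac{\partial\mathbf{H}}{\partial\mathbf{p}_k}. \tag{63.7}$$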


† Dirac, The Principles of Quantum Mechanics, second edition, Oxford, 1935. 





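Furthermore, it is readily seen that analogous expressions can be 
obtained for the momenta. The important results thus obtained are 
summarized by the equations 

$$[\mathbf{H},\mathbf{q}_k] = \frac{h}{2\pi i}\frac{\partial\mathbf{H}}{\partial\mathbf{p}_k} \quad\text{and}\quad [\mathbf{H},\mathbf{p}_k] = -\frac{h}{2\pi i}\frac{\partial\mathbf{H}}{\partial\mathbf{q}_k}. \tag{63.8}$$

These equations may now be substituted in our general expression 
(63.3) for the rate of change in expectation values with time, to give 
the desired results 

$$\frac{d\bar q_k}{dt} = \overline{\frac{\partial H}{\partial p_k}} \quad\text{and}\quad \frac{d\bar p_k}{dt} = -\overline{\frac{\partial H}{\partial q_k}}. \tag{63.9}$$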


We thus obtain the quantum mechanical analogues of the equations of 
motion in the Hamiltonian form. They show that the expectation values 
of the coordinates and momenta of a quantum mechanical system 
depend on the expectation values of partial derivatives of the energy 
in the same way that the precise values of the coordinates and momenta 
of a classical system depend on the precise values of those partial 
derivatives.† 
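
A short worked case may make the content of these equations of motion for the expectation values more concrete. The sketch below assumes a single particle of mass m with a Hamiltonian of the form $H = p^2/2m + V(q)$; the harmonic potential and the series expansion are added here purely for illustration and are not part of the original derivation.

```latex
% Assumed Hamiltonian (illustrative): H = p^2/2m + V(q)
\frac{d\bar q}{dt} = \overline{\frac{\partial H}{\partial p}} = \frac{\bar p}{m},
\qquad
\frac{d\bar p}{dt} = -\overline{\frac{\partial H}{\partial q}} = -\overline{\frac{\partial V}{\partial q}} .

% Harmonic potential V = \tfrac{1}{2} k q^2: the force is linear in q, so
\frac{d\bar p}{dt} = -k\,\bar q
\quad\Longrightarrow\quad
m\,\frac{d^2\bar q}{dt^2} = -k\,\bar q ,
% and the mean position obeys the classical equation exactly.

% General potential: expanding V'(q) about \bar q and averaging,
\overline{V'(q)} \approx V'(\bar q) + \tfrac{1}{2}\,V'''(\bar q)\,(\Delta q)^2 ,
% so the classical law for \bar q is recovered when the dispersion term is negligible.
```

The last line shows where the dispersion around the expectation value enters: the mean of the force differs from the force at the mean position by a term proportional to $(\Delta q)^2$.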

As a result of this finding we now see that the classical mechanics 
may be regarded as the limiting form assumed by the quantum 
mechanics when the dispersion around the expectation values for the 
coordinates and momenta can be neglected. Since this dispersion, in 
accordance with the Heisenberg uncertainty relations, is due to the 
finite size of h, we can also regard the classical mechanics as the limiting 
form approached by the quantum mechanics as h goes to zero. We 
shall investigate the nature of the process by which the approach to 
the limit takes place somewhat more thoroughly later in the present 
section. 

(c) The conservation of energy in quantum mechanics. Returning to 
our fundamental equation for the rate of change in the expectation 
value of any function of the coordinates and momenta, 

$$\frac{d\bar F}{dt} = \frac{2\pi i}{h}\int \psi^*[\mathbf{H},\mathbf{F}]\psi \, dq, \tag{63.10}$$


it will also be profitable to apply this equation to the case of the energy 
itself, where we take F = H. 


† See Ehrenfest, Zeits. f. Phys. 45, 456 (1927). 




Since any operator commutes with itself, we at once obtain the result 

$$\frac{d\bar H}{dt} = 0. \tag{63.11}$$

Furthermore, since H will commute with any power of itself, we shall 
also have 

$$\frac{d\overline{H^n}}{dt} = 0 \tag{63.12}$$

for the time dependence of the expectation value of any integral power 
n of the energy. 

These results, which apply, of course, only to conservative systems 
where H is not an explicit function of the time, show that both the 
mean value of the energy and the relative probabilities of finding different 
particular values for the energy of an undisturbed system will not 
change with the time. These conclusions may be regarded as the 
quantum mechanical analogue of the principle of the conservation of 
energy. 

For the special case when the state of the system at some initial time 
is such that there is unit probability of finding only a particular value 
of the energy and zero probability for finding others, it is evident from 
the above that unit probability for finding that particular value of the 
energy will be retained until the time of the next measurement. We 
can then use the idea of the conservation of energy in the simple sense 
of the classical theory. 

(d) The conservation of momentum in quantum mechanics. We may 
also apply our present methods of investigation to the total linear 
momentum of a system. Let us consider a system of particles and let 
$x_k, y_k, z_k$ and $p_{xk}, p_{yk}, p_{zk}$ be the Cartesian coordinates and momenta 
of the individual particles. For the total momentum, in the x-direction 
for example, we can then write 

$$P_x = \sum_k p_{xk} \tag{63.13}$$

by summing the x-components of momenta for all the particles 
k = 1, ..., n. 

For the commutator of the operators corresponding to the energy 
and to this component of the total momentum we shall have 

$$[\mathbf{H},\mathbf{P}_x] = \sum_k [\mathbf{H},\mathbf{p}_{xk}] = -\frac{h}{2\pi i}\sum_k \frac{\partial\mathbf{H}}{\partial\mathbf{x}_k}, \tag{63.14}$$

where the last form results from the expression (63.8) already obtained 
for the commutator of energy with any individual component of 
momentum. For an isolated system, however, in which the Hamil- 
tonian is dependent only on the relative position of the particles, we 
shall evidently have 

$$\sum_k \frac{\partial H}{\partial x_k} = 0, \tag{63.15}$$

just as in the classical mechanics. (See § 13.) Hence the commutator 
(63.14) will be equal to zero, and, in accordance with (63.10), we then 
obtain 

$$\frac{d\bar P_x}{dt} = 0 \tag{63.16}$$

for the time dependence of the mean value of the total momentum of 
the system in the x-direction. 

Furthermore, the commutator for any power n of the momentum 
could be expanded in the form 

$$[\mathbf{H},\mathbf{P}_x^n] = [\mathbf{H},\mathbf{P}_x]\mathbf{P}_x^{n-1}+\mathbf{P}_x[\mathbf{H},\mathbf{P}_x^{n-1}], \tag{63.17}$$

and hence by a continuation of such steps into a succession of terms 
each containing the commutator $[\mathbf{H},\mathbf{P}_x]$ which we have seen to be equal 
to zero for an isolated system. Hence we also obtain 


$$\frac{d\overline{P_x^n}}{dt} = 0 \tag{63.18}$$

for the time rate of change of the expectation value of any power of 
the total momentum in the x-direction of an isolated system. 

Equations (63.16) and (63.18) provide the quantum mechanical ana- 
logue of the principle of the conservation of momentum. Just as in the 
case of energy, we see for an isolated system which is known at some 
initial time to have a particular value of the momentum, we shall have 
conservation of momentum in the simple classical sense up to the time 
of the next measurement. 

We can also treat the total angular momentum of a system around 
a selected axis. Let us consider a system of particles, for simplicity in 
the absence of any magnetic field, and let us use Cartesian coordinates 
$x_k, y_k, z_k$ and the corresponding momenta $p_{xk}, p_{yk}, p_{zk}$ for the different 
particles k = 1, ..., n of the system. Taking the angular momentum of 
the system around the z-axis as an example, we shall have as the 
operator corresponding thereto 

$$\mathbf{M}_z = \sum_k (\mathbf{x}_k\mathbf{p}_{yk}-\mathbf{y}_k\mathbf{p}_{xk}), \tag{63.19}$$

where the summation is taken over all the n particles of the system, 
and no symmetrization of the operator has been necessary owing to 





the commutability of non-conjugated coordinates and momenta. For 
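the commutator of this operator with H, we have a sum of terms 

$$[\mathbf{H},\mathbf{M}_z] = \sum_k \{[\mathbf{H},\mathbf{x}_k\mathbf{p}_{yk}]-[\mathbf{H},\mathbf{y}_k\mathbf{p}_{xk}]\}. \tag{63.20}$$

And treating the individual terms by our general rules (55.33) for 
manipulating commutators, and substituting from the fundamental 
expressions (63.8) for the commutators of energy with individual co- 
ordinates and momenta, we obtain results of the form 

$$\begin{aligned}
[\mathbf{H},\mathbf{x}\mathbf{p}_y]-[\mathbf{H},\mathbf{y}\mathbf{p}_x] &= [\mathbf{H},\mathbf{x}]\mathbf{p}_y+\mathbf{x}[\mathbf{H},\mathbf{p}_y]-[\mathbf{H},\mathbf{y}]\mathbf{p}_x-\mathbf{y}[\mathbf{H},\mathbf{p}_x] \\
&= \frac{h}{2\pi i}\left\{\frac{\partial\mathbf{H}}{\partial\mathbf{p}_x}\mathbf{p}_y-\mathbf{x}\frac{\partial\mathbf{H}}{\partial\mathbf{y}}-\frac{\partial\mathbf{H}}{\partial\mathbf{p}_y}\mathbf{p}_x+\mathbf{y}\frac{\partial\mathbf{H}}{\partial\mathbf{x}}\right\}.
\end{aligned} \tag{63.21}$$

The first and third terms on the right-hand side of this expression will 
cancel, however, for the case of a particle where the momenta enter 
the Hamiltonian as a sum of squared terms $(p_x^2+p_y^2+p_z^2)$. Hence, by 
substitution into our general expression (63.10) for the rate of change 
in expectation values, we now obtain the desired result 

$$\frac{d\bar M_z}{dt} = \sum_k \overline{\left(y_k\frac{\partial H}{\partial x_k}-x_k\frac{\partial H}{\partial y_k}\right)}. \tag{63.22}$$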


Just as in the classical mechanics, however, this result will be equal to 
zero for an isolated system in which the Hamiltonian is invariant to a 
rotation of axes. (See § 13.) We can, of course, also treat the time rate 
of change in the expectation value for any power of the angular 
momentum. We thus obtain the quantum mechanical analogue of the 
principle of the conservation of angular momentum. 

(e) Approach of quantum mechanical behaviour to the classical limit. 
We have already emphasized, in connexion with equations (63.9), that 
the classical mechanics could be regarded as a limiting case approached 
by the quantum mechanics as h goes to zero. We may now amplify 
our preceding treatment by considering somewhat more in detail the 
nature of the approach to the limit.† For purposes of illustration it 
will be sufficient to consider the behaviour of a single particle of mass 
m, with Cartesian coordinates denoted by $q_k$ (k = 1, 2, 3), moving in 
a potential field $V(q_k)$. 

Making use of coordinate language, we shall find it convenient to 
express the probability amplitude for this system in the quite general 
possible form 

$$\psi = A\,e^{(2\pi i/h)W}, \tag{63.23}$$

† The method is due to Wentzel, Zeits. f. Phys. 38, 518 (1926), and Brillouin, Comptes 
Rend. 183, 24 (1926). 





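where A and W are real functions of the coordinates and time. Sub- 
stituting this expression into the Schroedinger equation (56.6) for such 
a particle 

$$-\frac{h^2}{8\pi^2 m}\sum_k \frac{\partial^2\psi}{\partial q_k^2}+V\psi = -\frac{h}{2\pi i}\frac{\partial\psi}{\partial t},$$

and cancelling the factor $e^{(2\pi i/h)W}$, we obtain 

$$-\frac{h^2}{8\pi^2 m}\sum_k \left\{\frac{\partial^2 A}{\partial q_k^2}+\frac{4\pi i}{h}\frac{\partial A}{\partial q_k}\frac{\partial W}{\partial q_k}-\frac{4\pi^2}{h^2}A\left(\frac{\partial W}{\partial q_k}\right)^2+\frac{2\pi i}{h}A\frac{\partial^2 W}{\partial q_k^2}\right\}+VA = -\frac{h}{2\pi i}\frac{\partial A}{\partial t}-A\frac{\partial W}{\partial t}. \tag{63.24}$$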


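Separating the real and imaginary parts, this then leads to two equa- 
tions which can be written, after some simplification, in the forms 

$$\sum_k \frac{1}{2m}\left\{\left(\frac{\partial W}{\partial q_k}\right)^2-\frac{h^2}{4\pi^2 A}\frac{\partial^2 A}{\partial q_k^2}\right\}+V+\frac{\partial W}{\partial t} = 0$$

and 

$$\frac{\partial A^2}{\partial t}+\sum_k \frac{\partial}{\partial q_k}\left(A^2\,\frac{1}{m}\frac{\partial W}{\partial q_k}\right) = 0. \tag{63.25}$$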


These equations have been obtained without approximation and pro- 
vide the means for an exact description of the quantum mechanical 
behaviour of the particle. The second of the two equations relates the 
rate of change in probability density $(A^2)$ to the divergence of the 
appropriate expression for probability current. 

We must now examine the effect on these equations of letting h go 
to zero. The first of the two equations (63.25) can then be written in 
the form 

$$\sum_k \frac{1}{2m}\left(\frac{\partial W}{\partial q_k}\right)^2+V+\frac{\partial W}{\partial t} = 0. \tag{63.26}$$

And comparing with our previous classical equation (16.1), as presented 
in Chapter II, we see at our present stage of approximation that the 
solution of the above equation 

$$W = W(q_k, \alpha_k, t)+\text{const.}, \tag{63.27}$$

where the $\alpha_k$ are constants of integration, would be Hamilton's prin- 
cipal function for the classical motion of a particle of mass m in the 
potential field V. Hence, if we now consider for particles in the poten- 
tial field V the classical trajectories that would correspond to a parti- 
cular selection of the constants, we can write 

$$\frac{\partial W}{\partial q_k} = p_k = m\dot q_k, \tag{63.28}$$





in accordance with (16.13), where the $\dot q_k$'s, regarded as functions of 
the $q_k$'s and t, now give the velocity with which a classical particle 
would be moving at any point $q_k$ along such trajectories. 

Returning now to the second of the two equations (63.25), and sub- 
stituting (63.28), we can then write 

$$\frac{\partial A^2}{\partial t}+\sum_k \frac{\partial}{\partial q_k}\left(A^2\dot q_k\right) = 0, \tag{63.29}$$

which shows that the probability density $(A^2)$ would depend on position 
and time in the same manner as the density of a cloud of representative 
points moving along classical trajectories for particles of mass m in the 
potential field V. In the classical mechanics, however, it would be 
possible to consider the motion of a little region, containing such points, 
which could be chosen as small as desired. Hence, by letting h go to 
zero, we should approach the possibility of constructing a wave packet 
giving unit probability for finding a particle as time proceeds within 
as small a region as desired which would itself move along a classical 
trajectory. 

To investigate the range of validity for such a method of treat- 
ment, we note, in accordance with the first of equations (63.25), 
that the treatment depends on an approximation which is justified 
only when 


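$$\frac{h^2}{4\pi^2}\sum_k \frac{1}{A}\frac{\partial^2 A}{\partial q_k^2} \ll \sum_k \left(\frac{\partial W}{\partial q_k}\right)^2. \tag{63.30}$$

Or noting that the original form of solution (63.23) actually has 
a periodic character corresponding to an approximate wave-length 
$\lambda$, and a corresponding approximate momentum p, which would be 
given by 

$$\lambda = \frac{h}{p} = \frac{h}{\sqrt{\sum_k (\partial W/\partial q_k)^2}}, \tag{63.31}$$

we see that the approximation would only be justified with 

$$\sum_k \frac{1}{A}\frac{\partial^2 A}{\partial q_k^2} \ll \frac{4\pi^2}{\lambda^2}. \tag{63.32}$$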


We thus see that the construction of small wave packets with A fairly 
permanently concentrated in the neighbourhood of a classical trajectory 
will in any case only be possible for particles of high momentum and 
the corresponding small wave-lengths. 
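
The role of the wave-length in this criterion can be illustrated with rough numbers. The sketch below simply evaluates $\lambda = h/p$ for two cases chosen here as assumptions for illustration: an electron with a kinetic energy of 1 electron-volt (treated non-relativistically) and a grain of mass one milligram moving at one centimetre per second. Neither case, nor the names used in the code, comes from the text.

```python
import math

PLANCK_H = 6.62607015e-34         # Planck constant in J s
ELECTRON_MASS = 9.1093837015e-31  # electron rest mass in kg
JOULE_PER_EV = 1.602176634e-19    # joules per electron-volt

def wavelength(momentum):
    """Wave-length in metres associated with a momentum in kg m/s."""
    return PLANCK_H / momentum

# Illustrative case 1: electron with 1 eV of kinetic energy, p = sqrt(2 m E).
electron_p = math.sqrt(2.0 * ELECTRON_MASS * 1.0 * JOULE_PER_EV)

# Illustrative case 2: grain of 1 mg (1e-6 kg) moving at 1 cm/s (1e-2 m/s).
grain_p = 1.0e-6 * 1.0e-2

print(f"electron: {wavelength(electron_p):.3e} m")
print(f"grain:    {wavelength(grain_p):.3e} m")
```

The electron wave-length comes out comparable with atomic dimensions, while that of the grain is many orders of magnitude below any length over which its amplitude A could appreciably vary, which is why a classical trajectory is an excellent description in the second case only.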




D. FURTHER DEVELOPMENT OF QUANTUM MECHANICAL METHODS. 
TRANSFORMATION THEORY 

64. Characteristic states. Eigenvalues and eigenfunctions in general 

The foregoing completes our account of the quantum theoretical 
explanation of those changes away from classical ideas which have 
seemed empirically or conceptually necessary, and also completes our 
discussion of the limiting relations still preserved between classical and 
quantum mechanics. We must now consider some further general de- 
velopments of quantum mechanical methods that can be obtained from 
the postulatory basis already introduced. We shall then be ready in 
the next chapter to undertake applications of quantum mechanics to 
specific problems, which will be necessary for our later statistical con- 
siderations. 

(a) Equation determining a characteristic state. In the present section 
we shall study the possibility of specifying the state of a quantum 
mechanical system, at any time of interest, in such a way that some 
selected observable dynamical quantity has a precise value. Such a 
state may be called a characteristic state for the observable quantity in 
question. 

To carry out the investigation, let us take $F(q,p)$ as the observable 
quantity which is to exhibit a precise value (eigenvalue) at the time 
of interest $t_0$; and let us make use of a coordinate representation so 
that the state of the system at that time can be specified in accordance 
with the foregoing by a function of the coordinates 

$$u(q_1 \ldots q_f) = \psi(q_1 \ldots q_f, t_0) \tag{64.1}$$

which gives the probability amplitude $\psi(q, t)$ at the time $t_0$. Using this 
notation, it will then be found that the equation 

$$\mathbf{F}u(q_1 \ldots q_f) = F_k\,u(q_1 \ldots q_f), \tag{64.2}$$

where F is the quantum mechanical operator corresponding to the 
observable $F(q,p)$, will determine the form of the probability amplitude 
$u(q)$ necessary to specify a state such that this observable will exhibit 
the precise value $F_k$. 

To prove this principle we note in the first place that the mean value 
of the nth power of the quantity $F(q,p)$ would in any case be given 
in accordance with the quantum mechanics by an equation having the 
form of (57.4), namely 

$$\overline{F^n} = \int u^*(q)\,\mathbf{F}^n u(q) \, dq, \tag{64.3}$$

where $u(q)$ is the probability amplitude for the system at the time of 
interest. And hence, if we actually specify the state of the system by 
a function $u(q)$ which is a solution of (64.2), we see that we shall then 
be able to write 

$$\begin{aligned}
\overline{F^n} &= \int u^*(q)\,\mathbf{F}^n u(q) \, dq \\
&= F_k^n \int u^*(q)\,u(q) \, dq \\
&= F_k^n,
\end{aligned} \tag{64.4}$$

where the second form of expression is justified by the fact that $F_k$ is 
merely a number, and the third form by the consideration that the 
integrated probability for finding the coordinates somewhere in their 
possible range must be equal to unity. This result shows, however, that 
the mean value of any integral power of F is equal to the same power 
of the specified quantity $F_k$, which can only be realized if F itself has 
the precise value 

$$F = F_k \tag{64.5}$$

as was to be proved. 

Equation (64.2) is hence a very important one, since it can be used 
to study the form of the probability amplitude when any chosen 
observable quantity has any one of the different precise values which 
it can assume. Indeed, starting from a different point of view, we have 
already discussed in § 60 the important equation (60.4) 

$$\mathbf{H}u(q) = E\,u(q), \tag{64.6}$$

which we now see to be the special form assumed by the general equa- 
tion (64.2) for the case when the energy of the system is the quantity 
which has a definite specified value. 

(b) Eigenvalues and eigenfunctions corresponding to characteristic 
states. Returning to the more general expression 

$$\mathbf{F}u(q_1 \ldots q_f) = F_k\,u(q_1 \ldots q_f), \tag{64.7}$$

it will now be profitable to consider several implications of this equa- 
tion for determining the forms of the probability amplitude $u(q)$ which 
give characteristic states for the observable quantity $F(q,p)$. 

In the first place it is evident that any solutions of (64.7) which 
have physical interest must give expressions for the probability ampli- 
tude $u(q)$ such that $u^*(q)u(q)$ can actually be interpreted as a probability 
density. This will be the case if $u(q)$ is a single-valued, continuous, and 
finite function of the coordinates q over their range of significant 
values.† It is evident, however, that such allowable solutions for $u(q)$ 
may in general be possible only for very special values of $F_k$. 

Hence equation (64.7) now permits us to determine those values of 
$F(q,p)$ which are physically possible. These may be called the eigen- 
values of this quantity. Having determined what these possible eigen- 
values are, we may then speak, as was previously done in the special 
case of energy, of the spectrum of possible eigenvalues for any observable 
quantity. Such spectra can, of course, turn out to be discrete, con- 
tinuous, or partly both as the case may be. 

Corresponding to each eigenvalue $F_k$, there will be one or more in- 
dependent solutions of (64.7) for the form of $u(q)$, and these may be 
called the eigenfunctions corresponding to the different possible eigen- 
values. In case there is only one such independent solution, we may 
call the state non-degenerate, and may label the solution with an appro- 
priate subscript, e.g. $u_k(q)$, to indicate the eigenvalue to which it corre- 
sponds. In case we find it possible to obtain more than one independent 
allowable solution corresponding to a given eigenvalue $F_k$, we shall say 
that the state is degenerate in the same way that we have previously 
spoken of degenerate energy levels. Under these circumstances we may 
either label the different eigenfunctions with single subscripts which do 
not try to specify the different eigenvalues to which the function may 
correspond, or we may use pairs of subscripts, as, for example, $u_{kl}(q)$, 
which would denote the lth independent solution corresponding to the 
kth eigenvalue. 

(c) Properties of eigenvalues and eigenfunctions. The properties of 
these eigenvalues and eigenfunctions may now be studied, using similar 
methods to those employed in the latter part of § 60 for studying the 
eigenvalues and eigenfunctions for the particular case of states charac- 
teristic of the energy. It will be profitable to carry out such a study, 
treating degenerate states somewhat more in detail than in § 60. 

In accordance with our general equation (64.7) for a characteristic 
state, and the complex conjugate equation corresponding to it, we may 
write 

$$\mathbf{F}u_k = F_k u_k \quad\text{and}\quad (\mathbf{F}u_j)^* = F_j^* u_j^*$$

as equations applying to the eigenvalues and eigenfunctions indicated. 
Multiplying the first of these equations by $u_j^*$ and the second by $u_k$ and 
integrating over the whole range of the coordinates q, we obtain 

$$\int u_j^*\mathbf{F}u_k \, dq = F_k \int u_j^* u_k \, dq \quad\text{and}\quad \int u_k(\mathbf{F}u_j)^* \, dq = F_j^* \int u_j^* u_k \, dq.$$

Since the operator F is Hermitian, this then leads, in accordance with 
the definition of Hermitian operators (55.26), to the useful result 

$$(F_k-F_j^*)\int u_j^* u_k \, dq = 0. \tag{64.8}$$

With the help of this equation we may now obtain some important 
conclusions. 

† For a discussion of the somewhat less restricted requirements actually necessary 
for a satisfactory solution, see Pauli, Handbuch der Physik, xxiv/1, second edition, 
Berlin, 1933, p. 123. 

In the first place let us consider the case j = k, where the subscripts 
refer to the same eigenfunctions. On account of the essentially positive 
character of $u_k^* u_k = |u_k|^2$ we then obtain 

$$F_k = F_k^* \tag{64.9}$$

and must conclude that all possible eigenvalues are real quantities, in 
agreement with our previous finding (55.47) that the expectation values 
of observable quantities are always real. 

As a second possibility let us consider the case $j \ne k$, corresponding 
to the different eigenvalues $F_j \ne F_k$. To preserve the truth of equation 
(64.8) we must then have 

$$\int u_j^* u_k \, dq = 0 \tag{64.10}$$

and are led to the important conclusion that the eigenfunctions corre- 
sponding to different eigenvalues are necessarily orthogonal. 
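
These two conclusions, reality of the eigenvalues and orthogonality of eigenfunctions belonging to different eigenvalues, can be seen at work in the smallest possible case. The following is a minimal sketch that assumes a finite two-dimensional analogue in which the Hermitian operator is the matrix [[a, b], [conj(b), d]] with real a, d and non-zero b, and the integral of $u_j^* u_k$ is replaced by a finite sum; the function names and the numerical entries are hypothetical.

```python
import math

def hermitian_eigen_2x2(a, b, d):
    """Eigenvalues and unnormalized eigenvectors of [[a, b], [conj(b), d]].

    Assumes a and d are real and b is a non-zero complex number.
    """
    mean = 0.5 * (a + d)
    root = math.sqrt((0.5 * (a - d)) ** 2 + abs(b) ** 2)
    eigenvalues = (mean - root, mean + root)  # real by construction
    # For each eigenvalue lam, the vector (b, lam - a) satisfies the first row
    # of the eigenvalue equation identically, and the second row by virtue of
    # the characteristic equation (lam - a)(lam - d) = |b|^2.
    eigenvectors = [(b, lam - a) for lam in eigenvalues]
    return eigenvalues, eigenvectors

def inner(u, v):
    """Finite analogue of the integral of u* v dq."""
    return sum(x.conjugate() * y for x, y in zip(u, v))

values, vectors = hermitian_eigen_2x2(2.0, 1 - 1j, -1.0)
print(values)
print(abs(inner(vectors[0], vectors[1])))
```

The overlap printed on the last line vanishes to within rounding error, as the argument leading to (64.10) requires for two different eigenvalues.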

Finally we must consider the possibility $j \ne k$ but $F_j = F_k$, which 
would arise in the case of a degenerate state having more than one 
independent eigensolution, i.e. both $u_j$ and $u_k$ corresponding to the 
same eigenvalue $F_k$. Under these circumstances it would no longer be 
possible to conclude from (64.8) that these different eigenfunctions were 
necessarily orthogonal. We must now give special consideration to the 
eigenfunctions corresponding to degenerate states. 

Considering a particular eigenvalue $F_k$, we shall call the corresponding 
state g-fold degenerate in case we can find g essentially different solu- 
tions $u_{kl}(q)$ (l = 1, 2, ..., g) of equation (64.7) corresponding to this 
eigenvalue. As the criterion of essential difference or linear independence 
of the different solutions, $u_{k1}, \ldots, u_{kg}$, we shall require that no linear 
relation can be found of the form 

$$a_1 u_{k1}+a_2 u_{k2}+\cdots+a_g u_{kg} = 0, \tag{64.11}$$

where the a's are constant coefficients, which for arbitrary values of 
the independent variables q would permit us to solve for one of the 
functions in terms of the others. Having found such a set of in- 
dependent functions, it will then be possible, however, to express any 
further possible solution of (64.7), corresponding to $F_k$, as a linear com- 
bination of the members of this set. 

As already noted, the independent eigenfunctions composing the set 
would not necessarily be orthogonal since they correspond to the same 
eigenvalue $F_k$. Nevertheless, it will always be possible (and indeed in 
an infinite number of ways) to construct therefrom a satisfactorily 
equivalent set the members of which will be orthogonal. To do this 
we may, for example, take $u_{k1}$ itself as the first member of the new set 

$$v_{k1} = u_{k1}.$$

As the second member of the set we may then take 

$$v_{k2} = u_{k2}+a\,v_{k1},$$

where the constant a may evidently be chosen in such a way as to 
secure the desired orthogonality 

$$\int v_{k1}^* v_{k2} \, dq = \int v_{k1}^* u_{k2} \, dq + a\int v_{k1}^* v_{k1} \, dq = 0.$$

Continuing, we may then take the next member of the set as 

$$v_{k3} = u_{k3}+\beta\,v_{k1}+\gamma\,v_{k2},$$

where $\beta$ and $\gamma$ are chosen to secure orthogonality of $v_{k3}$ with both the 
previous eigenfunctions $v_{k1}$ and $v_{k2}$. In this way we can evidently pro- 
ceed until we obtain a new set of g independent eigenfunctions which 
are orthogonal and are equivalent to the original set for the purpose 
of forming linear combinations to express the different solutions of 
(64.7) that correspond to the eigenvalue $F_k$ of the degenerate state in 
question. 

As a result of the foregoing discussion we now see that our equation 
(64.7) for the states characteristic of any observable quantity F will 
provide a series of independent eigenfunctions $u_k(q)$ in terms of which 
we can express any solution characteristic of that observable. These 
eigenfunctions will either be necessarily orthogonal, 

$$\int u_j^* u_k \, dq = 0 \quad (j \ne k), \tag{64.12}$$

or can be orthogonalized by the method described above. Further- 
more, they can evidently be multiplied by suitable factors so as to be 
normalized in accordance with the equation 

$$\int u_k^* u_k \, dq = 1. \tag{64.13}$$




In later sections we shall study the important use made of such eigen- 
functions in the quantum mechanics in obtaining expansions for other 
more general functions of the coordinates. 

(d) States characteristic of more than one observable. We must now 
inquire into the possibility of specifying states in such a way that more 
than one kind of observable quantity, corresponding, say, to the dif- 
ferent operators F and G, should simultaneously exhibit precisely 
defined values. 

In order to achieve this, we at once see from our previous general 
treatment of uncertainty relations given in § 62 (b) that it will in any 
case be necessary for the expectation value of the commutator of the 
two operators to vanish, 

$$\overline{(FG-GF)} = 0,$$

since otherwise the two quantities would be subject to uncertainties 
$\Delta F$ and $\Delta G$ which, in accordance with (62.17), could not simultaneously 
go to zero. This necessary condition will, of course, always be satisfied 
if the two operators commute, 

FG = GF. (64.14) 

We shall now be able to show, moreover, that such commutability will 
actually be a sufficient condition for obtaining states such that F and 
G will both be precisely defined. 

To investigate this let us consider the equation

$$\mathbf{F}u_{kl}(q) = F_k\,u_{kl}(q) \qquad (l = 1, 2, \ldots, g), \tag{64.15}$$

which would determine, in the case of $g$-fold degeneracy, the $l$th
independent eigenfunction corresponding to the eigenvalue $F_k$. Applying
the operator $\mathbf{G}$ to this equation and making use of the
commutability (64.14), we then obtain

$$\mathbf{F}(\mathbf{G}u_{kl}) = F_k(\mathbf{G}u_{kl}). \tag{64.16}$$

From this result we see that $\mathbf{G}u_{kl}$ must itself express a state
characteristic of the eigenvalue $F_k$, and hence can be written as a
linear combination

$$\mathbf{G}u_{kl} = \sum_m G_{ml}\,u_{km} \qquad (l = 1, 2, \ldots, g) \tag{64.17}$$

of the independent eigenfunctions $u_{km}$ corresponding to that
eigenvalue.

In the special case of no degeneracy, (64.17) then solves the problem,
since it shows that the single $u_{k1}$ will be an eigenfunction of
$\mathbf{G}$ as well as of $\mathbf{F}$, and the single $G_{11}$ an
eigenvalue of $\mathbf{G}$.

In the general case, the $g^2$ quantities $G_{ml}$ occurring in (64.17)
will be constants whose values could evidently be determined, making use
of the normalization and orthogonality of the eigenfunctions $u_{km}$,
from expressions of the form

$$G_{ml} = \int u_{km}^*\,\mathbf{G}\,u_{kl}\,dq. \tag{64.18}$$

As a result of (64.18) and the Hermitian character of the operator
$\mathbf{G}$, we then see that the quantities $G_{ml}$ are the components
of an Hermitian matrix.

This now makes it possible to apply a known theorem concerning
Hermitian forms.† In accordance with this theorem, if we have a set
of $g$ independent, normalized, orthogonal eigenfunctions $u_l(q)$, it will
be possible to change to a new set of similar, independent, normalized,
orthogonal eigenfunctions $v_n(q)$ such that a general Hermitian form in
the original eigenfunctions will be connected with a simplified
Hermitian form in the new eigenfunctions by the equation

$$\sum_{l,m} u_l^*\,G_{ml}\,u_m = \sum_n v_n^*\,G_n\,v_n, \tag{64.19}$$

where the $g$ quantities $G_n$ are real constants, and the equations
connecting the old and new eigenfunctions are those for a unitary
transformation (see § 67 (e))

$$v_n = \sum_l S_{ln}\,u_l \quad\text{and}\quad u_l = \sum_n S_{ln}^*\,v_n, \tag{64.20}$$

with the components of the transformation matrices subject to the
relations

$$S_{ln} = \int u_l^*\,v_n\,dq, \qquad S_{ln}^* = \int v_n^*\,u_l\,dq. \tag{64.21}$$

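To apply this theorem we return to (64.17) and for simplicity
temporarily drop the index $k$ which labels all the eigenfunctions
involved. Multiplying (64.17) by $u_l^*$ and summing over $l$, we then
obtain

$$\sum_l u_l^*\,\mathbf{G}\,u_l = \sum_{l,m} u_l^*\,G_{ml}\,u_m,$$

which gives us, in accordance with (64.19),

$$\sum_l u_l^*\,\mathbf{G}\,u_l = \sum_n v_n^*\,G_n\,v_n,$$

and this in turn can be rewritten, in accordance with (64.20) and
(64.21), in the forms

$$\sum_{l,n,r} S_{ln}\,S_{lr}^*\,v_n^*\,\mathbf{G}\,v_r
  = \sum_{n,r} \delta_{nr}\,v_n^*\,\mathbf{G}\,v_r
  = \sum_n v_n^*\,\mathbf{G}\,v_n
  = \sum_n v_n^*\,G_n\,v_n.$$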

† See Weyl, Gruppentheorie und Quantenmechanik, tr. Robertson, London, 1931, p. 21.




Taking cognizance of the independent character of the functions $v_n^*$,
the last two members of this series of equations then lead to the
desired final expression, in which the common index $k$ has again been
introduced,

$$\mathbf{G}v_{kn} = G_n\,v_{kn} \qquad (n = 1, 2, \ldots, g). \tag{64.22}$$

In accordance with this result and our fundamental expression for
characteristic states (64.7), we now see that our new functions $v_{kn}$
describe states of the system such that the quantity $G$ will exhibit one
or another of the precise values $G_n$. Furthermore, since the $v_{kn}$ are
themselves expressible in accordance with (64.20) as linear combinations
of the original $u_{kl}$, we see that the states will also be such
that the system exhibits the precise value $F_k$. We have hence given
the desired demonstration that states simultaneously characteristic of
different observable quantities can be found, provided only that the
operators corresponding to these observables commute.
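
The construction leading to (64.22) can be illustrated for the simplest degenerate case. The sketch below assumes a two-fold degenerate eigenvalue and a real symmetric matrix of coefficients $G_{ml}$ (the general Hermitian case needs complex phases as well). The function name `diagonalize_2x2` and the sample numbers are hypothetical.

```python
import math

def diagonalize_2x2(G):
    """Rotate a real symmetric 2x2 matrix G[m][l] to diagonal form.

    Returns (values, S): column n of S holds the coefficients of the new
    function v_n in terms of the original u_1, u_2, and values[n] is the
    corresponding eigenvalue G_n.
    """
    a, b, d = G[0][0], G[0][1], G[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - d)  # angle that removes off-diagonals
    c, s = math.cos(theta), math.sin(theta)
    S = [[c, -s], [s, c]]
    values = [a * c * c + 2 * b * c * s + d * s * s,
              a * s * s - 2 * b * c * s + d * c * c]
    return values, S

# Coefficients G_ml of (64.17) within a two-fold degenerate level of F
# (illustrative numbers only).
G = [[2.0, 1.0], [1.0, 2.0]]
values, S = diagonalize_2x2(G)
```

Because every function in the degenerate set belongs to the same eigenvalue $F_k$, any such combination remains an eigenfunction of $\mathbf{F}$. The rotation merely picks out the combinations that are also eigenfunctions of $\mathbf{G}$.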

The above possibility of obtaining characteristic states can, of course,
be extended to more than two observables, provided all the
corresponding operators commute. For example, a state such as $v_{kn}$
considered above, which is characterized by the specific eigenvalues
$F_k$ and $G_n$, may itself still be a degenerate state, and a further
application of the foregoing methods of treatment may make it possible
to introduce new functions characterized by eigenvalues of three
essentially different observable quantities $F$, $G$, and $H$. In the case
of a system of $f$ degrees of freedom, $f$ would be the maximum number of
actually independent observables which could be so employed.

(e) Eigenvalues and eigenfunctions for the coordinates and momenta.
Included among the general possibilities for states characteristic of
different observable quantities we have the special cases of states
characteristic of a coordinate or of a momentum. The eigenvalues and
eigenfunctions then present some special properties which deserve
consideration. It will be sufficient to illustrate these properties for a
one-dimensional problem with $q$ and $p$ as the canonical coordinate and
conjugate momentum.

If we are using coordinate language, the operator $\mathbf{q}$ will reduce
to $q$ itself and our general equation (64.7) for a state characteristic
of this quantity will reduce to the simple form

$$q\,u(q) = q_e\,u(q), \tag{64.23}$$

where $q_e$ is an eigenvalue, i.e. some particular observable value for the
coordinate $q$. As the solution of this equation we may take the improper
function

$$u(q) = \delta(q - q_e), \tag{64.24}$$

to which Dirac has given the name delta function, and which is usually
normalized by the condition

$$\int \delta(q - q_e)\,dq = 1. \tag{64.25}$$

It has the important property that for any continuous function $f(q)$
we have

$$\int f(q)\,\delta(q - q_e)\,dq = f(q_e). \tag{64.26}$$
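
The property (64.26) can be made plausible numerically by replacing the improper function with a narrow packet of unit area and letting its width shrink. The sketch below is an illustration only. The Gaussian shape, the midpoint-rule integration, and the names `gaussian` and `sift` are assumptions introduced here, not part of the text.

```python
import math

def gaussian(q, center, width):
    """Unit-area packet that narrows toward delta(q - center) as width -> 0."""
    norm = 1.0 / (width * math.sqrt(2.0 * math.pi))
    return norm * math.exp(-0.5 * ((q - center) / width) ** 2)

def sift(f, q_e, width, steps=4000, span=10.0):
    """Midpoint-rule estimate of the integral of f(q) * packet(q - q_e) dq."""
    start = q_e - span * width
    h = 2.0 * span * width / steps
    total = 0.0
    for i in range(steps):
        q = start + (i + 0.5) * h
        total += f(q) * gaussian(q, q_e, width)
    return total * h

q_e = 0.3
# Estimates for successively narrower packets; they approach cos(q_e).
estimates = [sift(math.cos, q_e, w) for w in (0.5, 0.1, 0.01)]
```

As the width decreases, the estimates move steadily toward $f(q_e)$, which is the sense in which the delta function is to be understood as a limit.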

Turning now to solutions characteristic of the momentum, the
operator $\mathbf{p}$ in coordinate language will have the form
$\dfrac{h}{2\pi i}\dfrac{d}{dq}$, and our general equation (64.7) for a
state characteristic of this quantity will have the form

$$\frac{h}{2\pi i}\frac{du(q)}{dq} = p_e\,u(q), \tag{64.27}$$

where $p_e$ is some eigenvalue for the momentum. As the solution of this
equation we can take

$$u(q) = e^{(2\pi i/h)\,p_e q}. \tag{64.28}$$

In evaluating expectation values from (64.24) and (64.28) by the
formula

$$\bar{F} = \int u^*(q)\,\mathbf{F}\,u(q)\,dq \tag{64.29}$$

one must approximate to the singular eigenfunctions, in the one case
by normalized packets more and more concentrated about the point
$q_e$, in the other by normalized packets corresponding to sharper and
sharper momentum definition at the value $p_e$, and must carry out the
integration (64.29) before passing to the limit.

Thus also in the somewhat special cases of coordinates and momenta, 
with their continuous ranges of eigenvalues, we can obtain formally 
appropriate characteristic state solutions. 


65. Expansions in terms of eigenfunctions 

The preceding section has discussed the methods of obtaining
eigenfunctions $u(q)$, for any quantum mechanical system, which would
describe states such that observable quantities exhibit precisely defined
values. With the help of these methods we can construct complete sets
of eigenfunctions which are very important in the quantum mechanics.
Such sets may be composed of independent eigenfunctions selected to
correspond to all the possible eigenvalues of some one particular
observable quantity $F$, or, if desired, they may be composed (in the case
of systems of more than one degree of freedom) of independent
eigenfunctions selected to correspond simultaneously to the eigenvalues of
more than one observable quantity $F$, $G$, etc.

In the case of such sets of eigenfunctions, different notations may
be used to distinguish one member of the set from another. The simplest
notation is to regard each member of the set as labelled with a specific
index. Thus we may write

$$u_n(q)$$

to denote the $n$th eigenfunction of the set, or may also write this in
the form

$$u(n,q)$$

to emphasize that the value of $u$ depends both on the index $n$ and on
the coordinates $q$. In the case of a system of one degree of freedom, the
index $n$ could be regarded as a number assuming integral or continuous
values according as the corresponding eigenvalues gave a discrete or
continuous spectrum. In the case of $f$ degrees of freedom the index $n$
could be regarded as a collection of $f$ such numbers. We shall for the
most part use this simple notation, although sometimes it is profitable
to use more complicated notations, such, for example, as $u_{kl}(q)$ in the
case of a degenerate state to denote the $l$th eigenfunction of the set
that corresponds to an eigenvalue $F_k$, or $u(F,G,\ldots,q)$ in the case of
a state characteristic of more than one observable to indicate the
dependence on the eigenvalues $F$, $G$, ....

In accordance with the discussion of the preceding section we shall
always regard the eigenfunctions of such sets as normalized and
orthogonal in agreement with the expression

$$\int u_n^*(q)\,u_m(q)\,dq = \delta_{nm}. \tag{65.1}$$

Such sets of normalized orthogonal functions are often used in the
quantum mechanics for the purpose of expressing some selected
function of the coordinates, say $f(q)$, as an expansion into a series of
the form

$$f(q) = c_1 u_1(q) + c_2 u_2(q) + \cdots + c_n u_n(q) + \cdots = \sum_k c_k\,u_k(q), \tag{65.2}$$

where the $c_k$ are constant coefficients which may be real, imaginary,
or complex.

When such an expansion is possible, the value of any coefficient $c_n$ can
easily be expressed by multiplying (65.2) through by the function $u_n^*(q)$
and integrating over the full range of the coordinates $q$. We thus obtain

$$\int u_n^*(q)\,f(q)\,dq = \sum_k c_k \int u_n^*(q)\,u_k(q)\,dq,$$

which, in accordance with the normalization and orthogonality
expressed by (65.1), gives us the simple result

$$c_n = \int u_n^*(q)\,f(q)\,dq. \tag{65.3}$$

This general method of obtaining the values of the coefficients is thus
the same as that familiar in the special case when the orthogonal
functions are the sine and cosine terms used in a Fourier expansion. Since
the values of the coefficients are definitely determined by (65.3), an
expansion in terms of a given set of orthogonal functions is unique.

In order that a successful expansion can be made in the form (65.2) 
we need some criterion for the convergence of the series and for the 
completeness of the set of orthogonal functions. The following remarks 
may be made in this connexion. 

In the quantum mechanics, expansions such as the above are used
for functions $f(q)$ such that the integral $\int |f(q)|^2\,dq$, over the whole
range of coordinates, exists in the sense of having a definite non-infinite
value. And it is sufficient to regard an expansion as satisfactory when
it gives convergence in the mean, i.e. when

$$\lim_{n \to \infty} \int \Bigl| f(q) - \sum_{k=1}^{n} c_k\,u_k(q) \Bigr|^2 dq = 0, \tag{65.4}$$

rather than to impose convergence of the series in the ordinary sense
as a necessary requirement.

By evaluating the indicated square in (65.4) and making use of the
normalization and orthogonality expressed by (65.1), the above equation
can readily be shown equivalent to a relation, having the simpler form,

$$\sum_{k=1}^{\infty} |c_k|^2 = \int |f(q)|^2\,dq. \tag{65.5}$$

The application of this equation, using values of the $c_k$ determined by
(65.3), can be used as a test for convergence in the mean, and therefore
also as a test for completeness since its fulfilment would show that no
further eigenfunctions could be added to those already employed.

In making use of a series expansion (65.2) of the kind that we have
suggested,

$$f(q) = \sum_k c_k\,u_k(q), \tag{65.6}$$

it will be appreciated that some of the coefficients $c_k$ may turn out to
be zero, in case $f(q)$ has special properties, such, for example, as being
an odd or even function, and it is also apparent that cases may arise
where only a few eigenfunctions may be necessary to give a
satisfactorily approximate representation of a function $f(q)$.
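
The coefficient formula (65.3), the completeness test (65.5), and the vanishing of some coefficients by symmetry can all be seen in a small numerical sketch. The choices here are assumptions made for illustration and are not taken from the text: normalized sine functions on an interval of unit length serve as the orthogonal set, the function expanded is $q(L-q)$, and integrals are estimated by the midpoint rule.

```python
import math

L = 1.0                      # length of the interval (illustrative)
M = 2000                     # number of midpoint-rule sample points
STEP = L / M
GRID = [(i + 0.5) * STEP for i in range(M)]

def u(n, q):
    """Normalized, mutually orthogonal sine functions on [0, L]."""
    return math.sqrt(2.0 / L) * math.sin(n * math.pi * q / L)

def f(q):
    """Function to be expanded; symmetric about the midpoint of the interval."""
    return q * (L - q)

def coefficient(n):
    """c_n = integral of u_n*(q) f(q) dq, as in (65.3); u_n is real here."""
    return sum(u(n, q) * f(q) for q in GRID) * STEP

c = [coefficient(n) for n in range(1, 21)]     # c_1 ... c_20
norm_f = sum(f(q) ** 2 for q in GRID) * STEP   # integral of |f|^2
parseval = sum(abs(ck) ** 2 for ck in c)       # partial sum in (65.5)
```

With these choices the even-numbered coefficients vanish because of the symmetry of $f(q)$, and the partial sum `parseval` already agrees closely with `norm_f` after a few terms, which is the behaviour expected of a complete set.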

From a more general point of view, nevertheless, it is evident that
we must contemplate the appearance in our series of all possible
eigenfunctions corresponding to the different eigenvalues of the observable
or observables of interest. In case these eigenvalues present a
continuous spectrum, this means that we must make use of the continuous
range of corresponding eigenfunctions; and in the case of degeneracy it
means that we must make use of all the independent eigenfunctions
that may correspond to a single eigenvalue. If, however, we do include
all possible eigenfunctions corresponding to the different eigenvalues
of any selected observable or set of observables, it appears possible to
obtain satisfactory expansions of the general type (65.6) for all the
functions $f(q)$, with $\int |f(q)|^2\,dq$ existing, which we need to treat by
this method for the purposes of the quantum mechanics.†

In the practical treatment of actual problems the simple summation, 
expressed by (65.6) with each independent eigenfunction labelled by
its own subscript k, is often replaced by formalisms which give explicit 
recognition to the occurrence of continuous ranges of eigenvalues, or 
to the nature of any degeneracy that may be present. This may involve 
the introduction of integration in place of summation, and the designa- 
tion of eigenfunctions in ways already mentioned to indicate their 
correlation with particular eigenvalues. Nevertheless, since the process 
of integration must be regarded as the limit approached by processes 
of summation, and since we are regarding our subscripts k as labelling
all possible eigenfunctions, the theoretical aspect of the problem is not
fundamentally altered by such changes in formalism. Furthermore, 
a variety of such formalisms, which are often quite complicated in 
structure, are employed in different special cases. Hence in the re- 
mainder of the present chapter we shall demonstrate those uses of 
eigenfunction expansions which are of interest to us, with the help of 
the simple formalism (65.6), calling special attention when necessary 
to aspects of the situation resulting from continuous spectra or from 
degeneracy. 

† See von Neumann, Mathematische Grundlagen der Quantenmechanik, Berlin, 1932,
chapter ii.

66. Expansion of the probability amplitude $\psi(q,t)$

(a) Expansion at a given time of interest. For the purposes of the
quantum mechanics, a very important application of the foregoing
possibility for series expansions arises when the function $f(q)$ is itself
taken as the probability amplitude

$$u(q) = \psi(q, t_0) \tag{66.1}$$

for the state of a system at some particular time of interest $t_0$. We can
then write

$$u(q) = \sum_k c_k\,u_k(q) \tag{66.2}$$

as an expansion for the probability amplitude $u(q)$, which describes
the actual state of the system, in terms of the eigenfunctions $u_k(q)$,
which describe states characteristic of some selected observable or set
of observables.

It will be noted that we might expect an expansion which converges
in the mean to give a sufficiently satisfactory expression for probability 
amplitudes, since the failure of such expansions at individual points 
would not appear serious for the uses made of probability amplitudes 
in the quantum mechanics. It will also be noted that this method of 
expression is in agreement with the general principle of the super- 
position of states discussed in § 59. 

The possibility of expressing the instantaneous state of a system in
terms of the eigenfunctions $u_k(q)$ characteristic of some particular
observable quantity $F$ is specially valuable when we are interested in
the probabilities that the system will exhibit one or another of the
different possible eigenvalues of that quantity. To investigate this use
we note, in accordance with the general expression (57.4) for obtaining
expectation values, that the $n$th power of $F$ in the state specified by
(66.2) would exhibit the mean value

$$\overline{F^n} = \int u^*(q)\,\mathbf{F}^n u(q)\,dq
  = \int \sum_k c_k^* u_k^*(q) \sum_l c_l\,F_l^n\,u_l(q)\,dq
  = \sum_k |c_k|^2 F_k^n, \tag{66.3}$$

where the third form of writing depends on the association of the
eigenvalue $F_k$ with the eigenfunction $u_k$, and the last form of writing
makes use of the normalization and orthogonality of eigenfunctions.


Since equation (66.3) is valid for any integral power $n$, we now see
that the probabilities in the state of interest of finding different values
$F_k$ for our observable quantity $F$ would be determined by the squares
of the corresponding coefficients in the series expansion (66.2). If the
eigenvalue $F_k$ is that for a single non-degenerate state, the probability
of finding this value would be equal to the square of the corresponding
coefficient

$$W(F_k) = |c_k|^2, \tag{66.4}$$

and in the case of $g$-fold degeneracy, where the same eigenvalue $F_k$
corresponds to $g$ different eigenfunctions $u_l(q)$, the probability
of finding this value would be given by the sum of the squares
of the corresponding coefficients

$$W(F_k) = \sum_l |c_l|^2. \tag{66.5}$$

The above expressions give actual rather than merely relative
probabilities, provided, of course, that the original probability amplitude
was normalized to give

$$\int u^*(q)\,u(q)\,dq = \sum_k |c_k|^2 = 1. \tag{66.6}$$
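
The rules (66.3) to (66.6) amount to a small calculation once the coefficients are known. The sketch below is a minimal illustration with made-up coefficients and eigenvalues. The function names `probabilities` and `mean_power` are hypothetical, and equal entries in the eigenvalue list stand for degenerate eigenfunctions.

```python
from collections import defaultdict

def probabilities(coeffs, eigenvalues):
    """Probability of each distinct eigenvalue: sum of |c|^2 over its eigenfunctions."""
    norm = sum(abs(c) ** 2 for c in coeffs)   # equals 1 for a normalized state
    W = defaultdict(float)
    for c, F in zip(coeffs, eigenvalues):
        W[F] += abs(c) ** 2 / norm
    return dict(W)

def mean_power(coeffs, eigenvalues, n):
    """Mean value of the n-th power of F: sum of |c_k|^2 F_k^n."""
    norm = sum(abs(c) ** 2 for c in coeffs)
    return sum(abs(c) ** 2 * F ** n for c, F in zip(coeffs, eigenvalues)) / norm

# Four eigenfunctions; the middle two share the eigenvalue 2.0
# (a two-fold degeneracy). All numbers are illustrative.
coeffs = [0.5, 0.5j, 0.5, -0.5]
eigenvalues = [1.0, 2.0, 2.0, 3.0]
W = probabilities(coeffs, eigenvalues)
```

Dividing by the norm makes the same functions usable for an unnormalized amplitude, in which case the raw squares would give only relative probabilities.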

(b) Expansion as a function of the time. By an obvious extension
the above method of expansion can also be used to give a form of
expression for the probability amplitude of the system not only at one
time but as a function thereof. To do this, since the probability
amplitude is assumed expressible at any given time by an expansion with
constant coefficients $c_k$, we can evidently change to coefficients $a_k(t)$
which vary with time, and obtain a satisfactory expression for the
probability amplitude as a function of the time by an expansion of
the form

$$\psi(q,t) = \sum_k a_k(t)\,u_k(q). \tag{66.7}$$

The coefficients in this series are seen by the same considerations that
led to (65.3) to be given by

$$a_n(t) = \int u_n^*(q)\,\psi(q,t)\,dq. \tag{66.8}$$

Furthermore, in case the eigenfunctions $u_k(q)$ are chosen so as to
correspond to the eigenvalues of some selected observable quantity, we
see that the probabilities of finding the different possible values for
that quantity would be given by expressions of the form

$$|a_k(t)|^2 \tag{66.9}$$

in the absence of degeneracy, and

$$\sum_l |a_l(t)|^2 \tag{66.10}$$

in the case of $g$-fold degeneracy, the sum extending over the $g$
eigenfunctions belonging to the eigenvalue in question.

(c) Special case of expansion in energy eigenfunctions. The
eigenfunctions $u_k(q)$ used in making expansions of the kind we are
discussing may be picked out to correspond to the eigenvalues $F_k$ of any
particular observable quantity that we may desire, or to correspond to the
eigenvalues $F_k$, $G_l$, etc., of more than a single observable. A case of
frequent interest proves to be that of expansion in terms of the solutions
corresponding to the eigenvalues of the energy. The coefficients $a_k(t)$
are then found in the case of an undisturbed system to have a particularly
simple form, which agrees with the simple time dependence for
stationary state solutions already discussed in § 60.

To see this we may substitute the expression for the probability
amplitude (66.7) into the Schroedinger equation (57.5) and write

$$\mathbf{H} \sum_k a_k(t)\,u_k(q) + \frac{h}{2\pi i}\frac{\partial}{\partial t} \sum_k a_k(t)\,u_k(q) = 0,$$

which is readily seen to lead to

$$\sum_k a_k(t)\,E_k\,u_k(q) + \frac{h}{2\pi i} \sum_k \frac{da_k(t)}{dt}\,u_k(q) = 0,$$

when we remember that the operator $\mathbf{H}$ is not dependent on $t$ for
an isolated system, and that we have expanded in terms of energy
eigenfunctions. Multiplying this last expression by $u_n^*(q)$ and
integrating over the whole range of coordinates $q$, we then obtain, with
the help of the normalization and orthogonality of the eigenfunctions, the
result

$$a_n(t)\,E_n + \frac{h}{2\pi i}\frac{da_n(t)}{dt} = 0,$$

which has the solution

$$a_n(t) = c_n\,e^{-(2\pi i/h)E_n t}, \tag{66.11}$$

where the quantities $c_n$ are constants.

Hence the expansion of an arbitrary probability amplitude in terms
of steady state solutions has the specially simple form in the case of
an isolated system

$$\psi(q,t) = \sum_k c_k\,u_k(q)\,e^{-(2\pi i/h)E_k t}. \tag{66.12}$$

The constant coefficients in this expression are seen by our previous
methods to be given by

$$c_k = \int u_k^*(q)\,e^{(2\pi i/h)E_k t}\,\psi(q,t)\,dq. \tag{66.13}$$

Furthermore, the probability of finding a particular value for the energy
$E_k$ will be given at any time by the constant expressions

$$W(E_k,t) = |c_k|^2 \tag{66.14}$$

or

$$W(E_k,t) = \sum_l |c_l|^2, \tag{66.15}$$

respectively in the absence or presence of a $g$-fold degeneracy, the sum
in (66.15) extending over the $g$ eigenfunctions belonging to $E_k$. This is
in agreement with our previous finding in § 63 (c) that the relative
probabilities for the system to exhibit different values of its energy
would remain constant for an isolated system so long as it remains
undisturbed.
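
The time dependence (66.11) and the constancy of the energy probabilities (66.14) can be checked with a few lines of arithmetic. The sketch below is an illustration under assumptions not in the text: units are chosen so that $h = 1$, the coefficients and energies are arbitrary sample values, and the time derivative is estimated by a central difference.

```python
import cmath
import math

PLANCK_H = 1.0  # units chosen so that h = 1 (illustrative)

def a(c_n, E_n, t, h=PLANCK_H):
    """Coefficient a_n(t) = c_n exp(-(2 pi i / h) E_n t), as in (66.11)."""
    return c_n * cmath.exp(-2j * math.pi * E_n * t / h)

def residual(c_n, E_n, t, dt=1e-6, h=PLANCK_H):
    """Left side of a_n E_n + (h / 2 pi i) da_n/dt = 0, via central difference."""
    dadt = (a(c_n, E_n, t + dt) - a(c_n, E_n, t - dt)) / (2.0 * dt)
    return a(c_n, E_n, t) * E_n + h / (2j * math.pi) * dadt

c = [0.6, 0.8j]   # constant coefficients of a normalized two-level state
E = [1.0, 2.5]    # corresponding energy eigenvalues

def energy_probabilities(t):
    """|a_n(t)|^2 for each energy eigenvalue at time t."""
    return [abs(a(cn, En, t)) ** 2 for cn, En in zip(c, E)]
```

Only the phases of the coefficients change with time, so `energy_probabilities` returns the same values at every instant, in agreement with the statement above for an undisturbed system.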


67. Transformation theory 


In the development of quantum mechanical methods so far given we 
have for the most part used the q-language in setting up and solving
our problems, although we have appreciated and sometimes used the
possibility of taking a momentum instead of a coordinate representation.
In the present section we shall give a brief treatment of the
quantum mechanical transformation theory,† which investigates the
possibility of using representations corresponding to any desired ob- 
servable quantity or compatible combination of observables. This branch 
of the theory is to some extent the analogue of the theory of canonical 
transformations in the classical mechanics ; and fairly general possibilities 
of using other than coordinate language would seem necessary if our 
new mechanics is to be regarded as a satisfactory extension of the old. 
Our development of the transformation theory will show that such 
possibilities do exist without essentially new additions to our postula- 
tory basis, and will provide us with a very general kind of language 
which is useful in the treatment of fundamental quantum mechanical 
problems. 


(a) Probability amplitudes in general. The possibility of a variety of
different modes of representing the state of a quantum mechanical
system is already implicit in our previous equations (66.7) and (66.8)
for the expansion of probability amplitudes, which we may now rewrite
in the forms

$$\psi(q,t) = \sum_k a(k,t)\,u(k,q) \tag{67.1}$$

and

$$a(n,t) = \int u^*(n,q)\,\psi(q,t)\,dq. \tag{67.2}$$

In accordance with these equations we see that the state of a quantum
mechanical system which is specified at any time $t$ by a knowledge of
the probability amplitude $\psi(q,t)$ as a function of the coordinates $q$ is
also equally well specified by a knowledge of the quantity $a(n,t)$ as
a function of the index $n$.

Either function can be equally well calculated from the other with
the help of the eigenfunctions $u(k,q)$ and $u^*(n,q)$ which play the role
of transformation functions between the two modes of representation.
Furthermore, the quantity $a(n,t)$ may be appropriately designated as
the transformed probability amplitude, corresponding to the observable
or observables whose eigenvalues are denoted by the different values
of the index $n$, since the probability of finding such eigenvalues will
be given by

$$W(n,t) = a^*(n,t)\,a(n,t) \tag{67.3}$$

in agreement with (66.10).

† First developed by Dirac and by Jordan; see in particular The Principles of
Quantum Mechanics, by Dirac, second edition, Oxford, 1935.

To be sure, the functions $\psi(q,t)$ and $a(n,t)$ enter the two equations
(67.1) and (67.2) in a somewhat unsymmetrical manner. This arises,
however, because the formalism is devised, on the one hand, to give
explicit recognition to the consideration that the coordinates $q$ are
known to exhibit a continuous spectrum of eigenvalues, but, on the
other hand, to leave the possibility open for the eigenvalues
corresponding to the index $n$ to exhibit discontinuous or continuous
spectra. When the eigenfunctions $u(k,q)$ are such as to correspond
simultaneously to the eigenvalues of $f$ independent observables
$F_1 \ldots F_f$, one for each degree of freedom, and these observables do
exhibit continuous spectra, summation over the index $k$ can be replaced by
an appropriate integration, and equations (67.1) and (67.2) can be
replaced by the entirely symmetrical formulation

$$\psi(q_1 \ldots q_f, t) = \int \cdots \int a(F_1 \ldots F_f, t)\,u(F_1 \ldots F_f, q_1 \ldots q_f)\,dF_1 \ldots dF_f \tag{67.4}$$

and

$$a(F_1 \ldots F_f, t) = \int \cdots \int \psi(q_1 \ldots q_f, t)\,u(q_1 \ldots q_f, F_1 \ldots F_f)\,dq_1 \ldots dq_f, \tag{67.5}$$

where we introduce the notation

$$u(q_1 \ldots q_f, F_1 \ldots F_f) = u^*(F_1 \ldots F_f, q_1 \ldots q_f) \tag{67.6}$$

in the interests of symmetry.

It is of interest to note that our originally postulated relations
(57.2) between the probability amplitudes $\psi(q,t)$ and $\phi(p,t)$,
appearing respectively in coordinate or in momentum representations, can
now be regarded as special cases of the above more general relations for
transforming to other than coordinate languages. Thus, in the case of a
system of one degree of freedom, our previous relations (57.2),

$$\psi(q,t) = h^{-1/2} \int \phi(p,t)\,e^{(2\pi i/h)pq}\,dp$$

and

$$\phi(p,t) = h^{-1/2} \int \psi(q,t)\,e^{-(2\pi i/h)pq}\,dq,$$

can now be regarded as special cases of (67.4) and (67.5), where the
transformation functions

$$u(p,q) = h^{-1/2}\,e^{(2\pi i/h)pq} \quad\text{and}\quad u(q,p) = u^*(p,q) = h^{-1/2}\,e^{-(2\pi i/h)pq}$$

are seen in agreement with (64.28) to be appropriately normalized
eigensolutions for states characteristic of the momentum $p$.

It is also of interest to note the formal possibility of regarding the
eigensolutions for states characteristic of $q$ itself as playing the role
of transformation functions. Thus in the case of a system of one degree
of freedom, equation (67.5) would assume the form

$$a(q',t) = \int \psi(q,t)\,\delta(q - q')\,dq = \psi(q',t),$$

if we substitute our previous eigensolution (64.24) for a state
characteristic of the coordinate eigenvalue $q'$. The result is seen to be
valid although trivial.

(b) Operators in general. When a coordinate representation $\psi(q,t)$ is
being employed, the expectation value for an observable quantity
$F(q,p)$ will be given by the relation

$$\bar{F} = \int \psi^*(q,t)\,\mathbf{F}\,\psi(q,t)\,dq, \tag{67.7}$$

assuming the operator $\mathbf{F}$ to have a known form, in accordance with
§ 55, suitable for operating on a function of the coordinates $q$.
Substituting the transformation equation (67.1), we can also write this
relation in the form

$$\bar{F} = \sum_{n,k} \int a^*(n,t)\,u^*(n,q)\,\mathbf{F}\,a(k,t)\,u(k,q)\,dq,$$

and this can also be expressed in the simpler form

$$\bar{F} = \sum_{n,k} a^*(n,t)\,F_{nk}\,a(k,t), \tag{67.8}$$

where the quantities $F_{nk}$ are the elements of the Hermitian matrix
defined by

$$F_{nk} = \int u^*(n,q)\,\mathbf{F}\,u(k,q)\,dq. \tag{67.9}$$

Defining the transformed operator $\mathbf{F}^{(n)}$, suitable for use in the
$a(n,t)$ representation, by

$$\mathbf{F}^{(n)} a(n,t) = \sum_k F_{nk}\,a(k,t), \tag{67.10}$$

we see that the new expression for the expectation value $\bar{F}$ as given
by (67.8) has a form

$$\bar{F} = \sum_n a^*(n,t)\,\mathbf{F}^{(n)} a(n,t), \tag{67.11}$$

which is the appropriate analogue of the original form of expression
for $\bar{F}$ as given by (67.7) in the $\psi(q,t)$ representation.

This illustrates the general possibility of transforming such operators
to any desired language. It also shows, as already remarked, that the
operators in the quantum mechanics which correspond to observable
quantities may assume a variety of forms. These include, in addition
to the simple differential operators which we often have in the case of
coordinate representations, operators whose action is expressible as in
(67.10) by summation with the components of appropriate matrices,
and operators where such summation is replaced by integration. The
possibility of reducing the expression for the action of an operator to
a simple form, merely involving multiplication and differentiation,
must be regarded as the exception rather than the rule. The frequent
simplicity of operator formulation in coordinate language, together
with the physical insight afforded by such language, makes the use
of coordinate representations specially convenient in the quantum
mechanics.
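
The equivalence of (67.7) and (67.8) can be checked numerically in a small case. The sketch below rests on assumptions chosen only for illustration: the eigenfunctions $u(n,q)$ are taken as normalized sine functions on the unit interval, the observable is the coordinate itself (so that the operator is multiplication by $q$), the state contains only the first two eigenfunctions, and integrals are estimated by the midpoint rule.

```python
import math

M = 2000                      # number of midpoint-rule sample points
STEP = 1.0 / M
GRID = [(i + 0.5) * STEP for i in range(M)]

def u(n, q):
    """Normalized sine eigenfunctions on the unit interval (assumed basis)."""
    return math.sqrt(2.0) * math.sin(n * math.pi * q)

def matrix_element(n, k, op):
    """F_nk = integral of u*(n,q) F u(k,q) dq for a multiplicative operator op(q)."""
    return sum(u(n, q) * op(q) * u(k, q) for q in GRID) * STEP

# Transformed amplitudes a(1), a(2) of a normalized state (illustrative values).
a = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]

# Matrix of the coordinate q in this representation, as in (67.9).
F = [[matrix_element(n, k, lambda q: q) for k in (1, 2)] for n in (1, 2)]

# Expectation value from the matrix form (67.8).
via_matrix = sum(a[n].conjugate() * F[n][k] * a[k]
                 for n in range(2) for k in range(2))

# Expectation value from the coordinate form (67.7).
def psi(q):
    return a[0] * u(1, q) + a[1] * u(2, q)

direct = sum(psi(q) * q * psi(q) for q in GRID) * STEP
```

Both routes give the same number, here close to $\tfrac{1}{2} - 16/(9\pi^2)$. The matrix $F_{nk}$ carries all the information about the operator that is needed in the $a(n,t)$ representation.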

(c) The Schroedinger equation in general. As the most fundamental
relation in our development of the quantum mechanics, we have the
Schroedinger equation, which can be written in the q-language in the
form

$$\mathbf{H}\,\psi(q,t) + \frac{h}{2\pi i}\frac{\partial \psi(q,t)}{\partial t} = 0. \tag{67.12}$$

This equation can be readily re-expressed in an analogous form in any
other mode of representation.

Introducing the transformation equation (67.1) into (67.12), and
noting that the operator $\mathbf{H}$ in the case of an isolated system will
depend on the coordinates alone, we obtain

$$\sum_k a(k,t)\,\mathbf{H}\,u(k,q) + \frac{h}{2\pi i} \sum_k \frac{\partial a(k,t)}{\partial t}\,u(k,q) = 0.$$

Multiplying throughout by $u^*(n,q)$ and integrating over all values of
the coordinates $q$, we then obtain, with the help of the normalization
and orthogonality of the eigenfunctions $u(k,q)$, the result

$$\sum_k a(k,t) \int u^*(n,q)\,\mathbf{H}\,u(k,q)\,dq + \frac{h}{2\pi i}\frac{\partial a(n,t)}{\partial t} = 0,$$

and this can be re-expressed in the simpler form

$$\sum_k H_{nk}\,a(k,t) + \frac{h}{2\pi i}\frac{\partial a(n,t)}{\partial t} = 0, \tag{67.13}$$

where the quantities $H_{nk}$ are the components of the Hermitian matrix
defined by

$$H_{nk} = \int u^*(n,q)\,\mathbf{H}\,u(k,q)\,dq. \tag{67.14}$$

Noting our previous definition (67.10) of the appropriate form for
operators in the $a(n,t)$ representation, we see that (67.13) is the
immediate analogue, in our present language, of the original Schroedinger
equation (67.12). The relation given by (67.13) may hence be called
the transformed Schroedinger equation.

(d) The Hermitian matrices corresponding to observable quantities.
The above-defined Hermitian matrices, which correspond to observable
quantities $F$ with components

$$F_{nk} = \int u^*(n,q)\,\mathbf{F}\,u(k,q)\,dq \tag{67.15}$$

in any particular mode of representation specified by the choice of
eigenfunctions $u(k,q)$, prove sufficiently important to warrant a special
examination of their properties.

Owing to the Hermitian character of the operators $\mathbf{F}$, which
correspond to observable quantities in the quantum mechanics, it will be
immediately seen from the definition of Hermitian operators (55.25)
that the matrices under consideration will themselves be Hermitian

$$F_{nk} = F_{kn}^*. \tag{67.16}$$

It will also be noted, in accordance with our general expression (57.4)
for the expectation value of a quantity, that any diagonal component
of such a matrix,

$$F_{kk} = \int u^*(k,q)\,\mathbf{F}\,u(k,q)\,dq, \tag{67.17}$$

will give the expectation value

$$F_{kk} = \bar{F} \tag{67.18}$$

for the observable $F$ when the system is actually in the state specified
by the single eigenfunction $u(k,q)$ then involved.

Let us now consider two different Hermitian operators $\mathbf{F}$ and
$\mathbf{G}$ corresponding to two different observable quantities $F$ and
$G$, and let us consider their action on the members of a set of
eigenfunctions $u(k,q)$ of the kind that we have been considering. In
accordance with our general possibility for developing functions of the
coordinates $q$, we can evidently write as special cases

$$\mathbf{F}u(k,q) = \sum_n u(n,q)\,F_{nk} \quad\text{with}\quad F_{nk} = \int u^*(n,q)\,\mathbf{F}\,u(k,q)\,dq \tag{67.19}$$

and

$$\mathbf{G}u(l,q) = \sum_m u(m,q)\,G_{ml} \quad\text{with}\quad G_{ml} = \int u^*(m,q)\,\mathbf{G}\,u(l,q)\,dq.$$

Combining these expressions, we then obtain

$$\int [\mathbf{F}u(k,q)]^*\,\mathbf{G}u(l,q)\,dq = \sum_{n,m} F_{nk}^*\,G_{ml} \int u^*(n,q)\,u(m,q)\,dq,$$

or, in accordance with the Hermitian character of the operators
$\mathbf{G}$ and $\mathbf{F}$ (see (55.25) and (55.26)),

$$\int u(l,q)\,[\mathbf{G}\mathbf{F}u(k,q)]^*\,dq = \int u^*(k,q)\,\mathbf{F}\mathbf{G}\,u(l,q)\,dq = \sum_n F_{nk}^*\,G_{nl},$$

or finally, by (67.16),

$$(FG)_{kl} = \sum_n F_{kn}\,G_{nl}. \tag{67.20}$$

We thus see that the matrix elements corresponding to the product
$\mathbf{F}\mathbf{G}$ of two operators $\mathbf{F}$ and $\mathbf{G}$ are
related to the matrix elements for the two operators themselves in
accordance with the rule for matrix multiplication. Hence (67.20) can also
be expressed in matrix language in the form

$$\|FG\| = \|F\|\,\|G\|. \tag{67.21}$$

Such matrices were of importance for the formulation of the so-called
matrix mechanics by Heisenberg, Born, and Jordan,† which antedated
the wave mechanics of Schroedinger. In accordance with preceding
discussions, it will be readily seen that the matrices $q_{mn}$, $p_{mn}$,
and $H_{mn}$ corresponding to the coordinates, momenta, and energy of a
system will have the following properties when obtained with the help
of a set of eigenfunctions $u_n(q)$ corresponding to the energy
eigenvalues $E_n$ for the system. The matrices for the coordinates and
momenta will turn out to be Hermitian and to be subject to commutation
rules that result from the commutation properties of their operators as
given by (55.38); and the energy matrix will turn out to be diagonal,
with eigenvalues corresponding to the eigenfunctions $u_n(q)$ employed,
when it is evaluated by treating the coordinates and momenta in the
classical Hamiltonian as Hermitian matrices obeying the above
commutation rules and the ordinary rules of matrix manipulation. This,
then, explains the original procedure of Heisenberg in accordance with
which the eigenvalues of energy for a system were to be obtained by
purely algebraic means by finding matrices for the coordinates,
momenta, and energy having the above properties. In practice the
procedure proved to be a feasible one only in a few simple cases.
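
The algebraic procedure just described can be illustrated by a minimal sketch, which is not taken from the original papers. It builds truncated coordinate and momentum matrices for a linear oscillator and checks that qp - pq equals ih/2π on the diagonal and that the energy matrix is diagonal. The oscillator example, the units (h/2π = 1, unit mass, unit frequency), the truncation to N = 6 rows, and all function names are assumptions introduced here; the last row and column are spoiled by the truncation and are left out of the printed check.

```python
import math

N = 6  # truncation size (assumption: the true matrices are infinite)


def matmul(a, b):
    """Ordinary matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def oscillator_matrices(n_states):
    """Hermitian q and p matrices of a linear oscillator in its energy states.

    Units are an assumption: h/2pi = 1, unit mass, unit frequency.
    """
    q = [[0j] * n_states for _ in range(n_states)]
    p = [[0j] * n_states for _ in range(n_states)]
    for n in range(n_states - 1):
        amp = math.sqrt((n + 1) / 2.0)
        q[n][n + 1] = q[n + 1][n] = amp
        p[n][n + 1] = -1j * amp
        p[n + 1][n] = 1j * amp
    return q, p


q, p = oscillator_matrices(N)
qp, pq = matmul(q, p), matmul(p, q)
commutator = [[qp[i][j] - pq[i][j] for j in range(N)] for i in range(N)]

# Energy matrix from the classical Hamiltonian (p^2 + q^2) / 2.
q2, p2 = matmul(q, q), matmul(p, p)
energy = [[0.5 * (q2[i][j] + p2[i][j]) for j in range(N)] for i in range(N)]

for n in range(N - 1):  # the last row is an artefact of the truncation
    print(n, commutator[n][n], energy[n][n].real)
```

Within these assumptions the diagonal elements of the energy matrix come out as n + 1/2 in units of hν, and all off-diagonal elements vanish.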

† Zeits. f. Phys. 33, 879 (1925); 34, 868 (1926); 35, 867 (1925-6).

To complete this brief discussion of the matrices corresponding to
observable quantities, a generalization of some interest must be
mentioned. The matrices so far defined by expressions of the form

$$ F_{nk} = \int u^*(n,q)\,F u(k,q)\,dq $$    (67.22)

are independent of the time t except in so far as the operator F might
itself be an explicit function of t. A more general time-dependent
matrix corresponding to the observable quantity F may be defined by

$$ F_{\alpha\beta}(t) = \int \psi^*_\alpha(q,t)\,F\,\psi_\beta(q,t)\,dq, $$    (67.23)


where $\psi_\alpha(q,t)$ and $\psi_\beta(q,t)$ are any pair of possible
solutions of the Schroedinger equation for the system under
consideration in their full time dependence. These matrices will be
Hermitian with

$$ F_{\alpha\beta}(t) = F^*_{\beta\alpha}(t), $$    (67.24)

and their diagonal elements

$$ F_{\alpha\alpha}(t) = \bar{F}_\alpha(t) $$    (67.25)

will give the expectation values for the observable F as functions of
time when the system is carrying out the behaviour corresponding to
the different possible solutions of the Schroedinger equation.

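Differentiating (67.23) with respect to the time, on the assumption
that F has no explicit dependence on t, and making use of the
Schroedinger equation (57.6), we obtain

$$ \begin{aligned}
\frac{dF_{\alpha\beta}(t)}{dt}
 &= \frac{2\pi i}{h} \int \left[ (H\psi_\alpha)^* F\psi_\beta - \psi^*_\alpha FH\psi_\beta \right] dq \\
 &= \frac{2\pi i}{h} \int \left[ \psi^*_\alpha HF\psi_\beta - \psi^*_\alpha FH\psi_\beta \right] dq \\
 &= \frac{2\pi i}{h} \int \psi^*_\alpha (HF-FH)\psi_\beta\,dq,
\end{aligned} $$    (67.26)

where the last two forms of writing are consequences of the Hermitian
character of the Hamiltonian operator H for the energy of the system
and of the operator F for the observable quantity of interest. (See
(55.25) and (55.26).)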

The result given by (67.26) can also evidently be expressed in the
forms

$$ \frac{dF_{\alpha\beta}(t)}{dt} = \frac{2\pi}{ih}(FH-HF)_{\alpha\beta} = \frac{2\pi}{ih}[F,H]_{\alpha\beta} $$

and

$$ \frac{d\|F\|}{dt} = \frac{2\pi}{ih}\|FH-HF\| = \frac{2\pi}{ih}\|F,H\|. $$    (67.27)

Comparing with our previous classical equation (11.4) for the rate
of change in the exact value of a function F of the coordinates and
momenta, we see that the quantities $(2\pi/ih)(FH-HF)$ and
$(2\pi/ih)\|F,H\|$ may be regarded in the present connexion as the
quantum mechanical analogues of the classical Poisson bracket {F,H}.
Furthermore, comparing the above with our previous equation (63.3) for
the rate of change in the expectation value of a function F, we see
that the diagonal terms appearing in the present equations are
equivalent in import to the earlier equation.
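
The relation between the time derivative of the matrix and the commutator with the energy can be checked numerically in a simple case. The sketch below is only illustrative: it takes the solutions of the Schroedinger equation to be three stationary states, so that the energy matrix is diagonal, and compares a finite-difference derivative of $F_{\alpha\beta}(t)$ with $(2\pi/ih)(FH-HF)_{\alpha\beta}$. The numerical energies, the matrix `F0`, the units with h/2π = 1, and the function names are all hypothetical choices made here.

```python
import cmath

HBAR = 1.0                  # assumption: units with h / (2*pi) = 1
ENERGIES = [0.3, 1.1, 2.0]  # hypothetical energy eigenvalues
F0 = [[0.5, 1.0 - 0.5j, 0.2j],
      [1.0 + 0.5j, -1.0, 0.7],
      [-0.2j, 0.7, 2.0]]    # hypothetical Hermitian matrix of F at t = 0


def f_matrix(t):
    """F_ab(t) for stationary solutions psi_a = u_a exp(-i E_a t / hbar)."""
    n = len(ENERGIES)
    return [[F0[a][b] * cmath.exp(1j * (ENERGIES[a] - ENERGIES[b]) * t / HBAR)
             for b in range(n)] for a in range(n)]


def commutator_rate(t):
    """(2*pi / i*h) (FH - HF)_ab, with H diagonal in this representation."""
    f = f_matrix(t)
    n = len(ENERGIES)
    return [[(f[a][b] * ENERGIES[b] - ENERGIES[a] * f[a][b]) / (1j * HBAR)
             for b in range(n)] for a in range(n)]


def numerical_rate(t, dt=1e-6):
    """Central-difference estimate of dF_ab/dt."""
    fp, fm = f_matrix(t + dt), f_matrix(t - dt)
    n = len(ENERGIES)
    return [[(fp[a][b] - fm[a][b]) / (2.0 * dt) for b in range(n)]
            for a in range(n)]


T = 0.8
exact = commutator_rate(T)
approx = numerical_rate(T)
max_diff = max(abs(exact[a][b] - approx[a][b])
               for a in range(3) for b in range(3))
print(max_diff)
```

In this special case the diagonal elements have zero rate of change, as expected for the expectation value of an observable in a steady state.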

(e) Unitary transformations between different quantum mechanical
representations. In the preceding parts of this section we have examined
the consequences of transforming the treatment of a quantum mechanical
problem from a coordinate representation, in which the state of the
system is given by a probability amplitude $\psi(q,t)$ for the
coordinates q, to some other representation, in which the state of the
system would be given by a probability amplitude $a(k,t)$ for the index
k labelling the eigenvalues and eigenfunctions $u(k,q)$ for some other
kind of observable quantity or quantities. The results obtained are
evidently general in character since they merely depend on the
assumption that the eigenfunctions $u(k,q)$ form a complete, orthogonal
set suitable for the expansion of functions of the coordinates. Hence
results of similar form will be obtained if we transform to any other
such representation, in which the state of the system would be given by
a probability amplitude, say $b(r,t)$, for the index r labelling some
other set of eigenfunctions, say $v(r,q)$. It will be useful and
instructive to examine the transformation between two such modes of
representation, corresponding to the amplitudes $a(k,t)$ and $b(r,t)$,
since our previous expressions have been affected by the lack of
symmetry arising from a formalism which was chosen, on the one hand, to
give explicit recognition to the continuous range of possible
eigenvalues for the coordinates q, and, on the other hand, to the
possibility of labelling individual eigenfunctions $u(k,q)$,
corresponding to some chosen observable or observables, with a specific
index k.

To carry out the investigation we may begin by considering the
interrelation between any two different complete sets of independent,
normalized, orthogonal eigenfunctions $u(k,q)$ and $v(r,q)$, which are
suitable for the expansion of functions of the coordinates q, in some
problem under consideration. Since the two sets of eigenfunctions
$u(k,q)$ and $v(r,q)$ are by hypothesis both of them suitable for the
expansion of functions of q, it is evident that the members of either
set may be expressed by an expansion in terms of the members of the
other set. Using for the time being the shortened notation $u_k$ and
$v_r$ to designate the members of the two sets, these expansions may be
written in the forms

$$ v_r = \sum_k u_k S_{kr} \quad\text{and}\quad u_k = \sum_r v_r S^{-1}_{rk}, $$    (67.28)

where $S_{kr}$ and $S^{-1}_{rk}$ are the coefficients needed for the two
expansions, and the particular order of writing has been chosen to agree
with that customary in matrix multiplication. From the normalization and
orthogonality of the two sets of eigenfunctions we see that these
coefficients, or components of the transformation matrices, will have
the values

$$ S_{kr} = \int u_k^* v_r\,dq \quad\text{and}\quad S^{-1}_{rk} = \int v_r^* u_k\,dq = S^*_{kr}. $$    (67.29)

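Furthermore, in accordance with the foregoing we note the successions
of relations

$$ \delta_{rs} = \int v_r^* v_s\,dq = \sum_{k,l} S^*_{kr} S_{ls} \int u_k^* u_l\,dq = \sum_k S^*_{kr} S_{ks} $$

and

$$ \delta_{kl} = \int u_l^* u_k\,dq = \sum_{r,s} S^*_{kr} S_{ls} \int v_s^* v_r\,dq = \sum_r S^*_{kr} S_{lr}, $$

which gives us the pair of relations

$$ \sum_k S^*_{kr} S_{ks} = \delta_{rs} \quad\text{and}\quad \sum_r S^*_{kr} S_{lr} = \delta_{kl}. $$    (67.30)

A transformation matrix, having components $S_{kr}$, which imply the
existence of those $S^{-1}_{rk}$ for the inverse transformation in the
manner given by the second equation (67.29), and which are subject to
the relations (67.30), is called unitary,† and the transformations
effected by it are called unitary transformations.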

With the help of the foregoing we may now consider the transformation
relations between the probability amplitudes $a(k,t)$ and $b(r,t)$,
corresponding to the two modes of representation. Since the state of
the system as expressed by $\psi(q,t)$ can be expanded in either set of
eigenfunctions $u(k,q)$ or $v(r,q)$, we can write the equations

$$ \psi(q,t) = \sum_k a(k,t)\,u(k,q) = \sum_r b(r,t)\,v(r,q). $$    (67.31)

Substituting from (67.28), this then gives us

$$ \sum_r b(r,t)\,v(r,q) = \sum_{k,r} a(k,t)\,v(r,q)\,S^{-1}_{rk} $$

† It is sometimes illuminating to regard the elements of the above transformation
matrices as correlated with unitary transformation operators $S$ and $S^{-1}$, which establish
a linear correspondence between the two sets of functions u and v in accordance with
the expressions $S u_r = v_r$ and $S^{-1} v_k = u_k$ (for all r and k).

Equations (67.29) can then be expressed in the form

$$ S_{kr} = \int u_k^*\,S u_r\,dq \quad\text{and}\quad S^{-1}_{rk} = S^*_{kr} = \int v_r^*\,S^{-1} v_k\,dq. $$

It will be noted that these unitary operators are not Hermitian and are thus quite
different in character from the quantum mechanical operators F that correspond to
observable quantities. It will be noted, however, that the unitary operators $S$ and $S^{-1}$,
by which we transform at a given time between two quantum mechanical modes of
representation a and b, have a character similar to the unitary operators $U(t)$ and
$U(-t)$, introduced later in §96 (d), by which we transform with a given quantum
mechanical mode of representation between two different times 0 and t.





for any arbitrary state, and hence, from the independence of the
eigenfunctions $v(r,q)$,

$$ b(r,t) = \sum_k S^{-1}_{rk}\,a(k,t) = \sum_k a(k,t)\,S^*_{kr} $$

and, similarly,

$$ a(k,t) = \sum_r S_{kr}\,b(r,t) = \sum_r b(r,t)\,(S^{-1}_{rk})^* $$    (67.32)

as the relations between the two kinds of probability amplitudes. This
will then permit us to re-express a state of the system in either language.

It will also be of interest to consider the transformation relations
between the matrix components

$$ F^{(a)}_{lk} = \int u^*(l,q)\,F u(k,q)\,dq \quad\text{and}\quad F^{(b)}_{sr} = \int v^*(s,q)\,F v(r,q)\,dq, $$    (67.33)

which would correspond in the two modes of representation to the same
observable quantity F. Substituting from (67.28) into the second of
the above equations, we obtain

$$ F^{(b)}_{sr} = \sum_{l,k} \int u^*(l,q)\,S^*_{ls}\,F u(k,q)\,S_{kr}\,dq, $$

which gives us, in accordance with the first of equations (67.33),

$$ F^{(b)}_{sr} = \sum_{l,k} S^*_{ls} F^{(a)}_{lk} S_{kr} = \sum_{l,k} S^{-1}_{sl} F^{(a)}_{lk} S_{kr} $$

and, similarly,

$$ F^{(a)}_{lk} = \sum_{s,r} (S^{-1}_{sl})^* F^{(b)}_{sr} S^{-1}_{rk} = \sum_{s,r} S_{ls} F^{(b)}_{sr} S^{-1}_{rk}, $$    (67.34)

or in matrix language

$$ \|F^{(b)}\| = \|S^{-1}\|\,\|F^{(a)}\|\,\|S\| \quad\text{and}\quad \|F^{(a)}\| = \|S\|\,\|F^{(b)}\|\,\|S^{-1}\| $$    (67.35)

as the relations between the Hermitian matrix components and
matrices corresponding to an observable quantity F in the two modes
of representation. This will then permit us to re-express, in either
language, relations which depend on these components, such as the
generalized Schroedinger equation or the expression for the
computation of expectation values.
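
The transformation relations for amplitudes and matrix components can be sketched numerically for a system with only two states. The example below is an illustration under stated assumptions, not part of the original treatment: the particular 2 x 2 unitary matrix `S`, the Hermitian matrix `F_a`, the amplitudes `a`, and all helper names are hypothetical. It transforms the amplitudes by $b = S^{-1}a$ and the matrix by $S^{-1}FS$, and then compares the expectation value of F computed in the two languages.

```python
import cmath
import math


def dagger(m):
    """Conjugate transpose; for a unitary matrix this is the inverse."""
    return [[m[j][i].conjugate() for j in range(len(m))]
            for i in range(len(m[0]))]


def matmul(a, b):
    """Ordinary matrix product of nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def expectation(f, amp):
    """Double sum over n and k of a*(n) F_nk a(k)."""
    return sum(amp[n].conjugate() * f[n][k] * amp[k]
               for n in range(len(amp)) for k in range(len(amp)))


theta, phi = 0.4, 1.3  # hypothetical parameters fixing a 2 x 2 unitary matrix
c, s = math.cos(theta), math.sin(theta)
S = [[c, -s * cmath.exp(1j * phi)],
     [s * cmath.exp(-1j * phi), c]]
S_inv = dagger(S)

F_a = [[1.0, 2.0 - 1.0j], [2.0 + 1.0j, -0.5]]  # hypothetical Hermitian matrix
a = [0.6, 0.8j]                                 # normalized amplitudes a(k)

# b(r) = sum_k S^-1_rk a(k), and F^(b) = S^-1 F^(a) S.
b = [sum(S_inv[r][k] * a[k] for k in range(2)) for r in range(2)]
F_b = matmul(matmul(S_inv, F_a), S)

identity = matmul(S_inv, S)
mean_a = expectation(F_a, a)
mean_b = expectation(F_b, b)
print(mean_a, mean_b)
```

Under these assumptions the two expectation values agree, and the transformed matrix remains Hermitian, which is the practical content of the statement that relations may be re-expressed in either language.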

The completely symmetrical character of the above transformation
relations depends on the circumstance that our formalism now gives
no recognition to the possibility that the indices k and r appearing in
the probability amplitudes $a(k,t)$ and $b(r,t)$ might really correspond
in the one case to a continuous and in the other case to a discrete
manifold of values. The general features of the transformation theory
are, however, most clearly appreciated with such a symmetrical if not
completely descriptive formalism.

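To complete this brief treatment of unitary transformations, it is to
be pointed out that the successive application of such transformations
will itself be a unitary transformation. To see this let us now consider
the successive transformations

$$ v_r = \sum_k u_k S_{kr} \quad\text{and}\quad w_c = \sum_r v_r T_{rc}, $$    (67.36)

together with the inverse transformations

$$ u_k = \sum_r v_r S^{-1}_{rk} \quad\text{and}\quad v_r = \sum_c w_c T^{-1}_{cr}. $$    (67.37)

We must now show that the direct transformation from the representation
corresponding to the eigenfunctions $u_k$ to that corresponding to
$w_c$ would have unitary character. Combining the equations (67.36),
we have

$$ w_c = \sum_{k,r} u_k S_{kr} T_{rc} = \sum_k u_k (ST)_{kc} \quad\text{with}\quad (ST)_{kc} = \sum_r S_{kr} T_{rc}, $$    (67.38)

and, similarly, from (67.37)

$$ u_k = \sum_{c,r} w_c T^{-1}_{cr} S^{-1}_{rk} = \sum_c w_c (ST)^{-1}_{ck} \quad\text{with}\quad (ST)^{-1}_{ck} = \sum_r T^{-1}_{cr} S^{-1}_{rk}, $$    (67.39)

where $(ST)_{kc}$ and $(ST)^{-1}_{ck}$ are the components of the new
transformation matrices. With the help of the unitary properties of the
individual matrix components $S_{kr}$ and $T_{rc}$ as given by (67.29)
and (67.30), however, the new components are also seen to have the
properties

$$ (ST)^{-1}_{ck} = \sum_r T^*_{rc} S^*_{kr} = (ST)^*_{kc}, $$

$$ \sum_k (ST)^*_{kc} (ST)_{kd} = \sum_{k,r,s} S^*_{kr} T^*_{rc} S_{ks} T_{sd} = \sum_{r,s} \delta_{rs} T^*_{rc} T_{sd} = \delta_{cd} $$    (67.40)

and, similarly,

$$ \sum_c (ST)^*_{kc} (ST)_{lc} = \delta_{kl}, $$    (67.41)

which completes the demonstration of the unitary character of the new
transformation matrices.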

(f) Concluding remarks on the general quantum mechanical language
provided by the transformation theory. As an important consequence
of the transformation theory, we are now provided with a very general
kind of language for expressing the principles of the quantum mechanics.
Thus our original postulates for the quantum mechanics, as summarized
in § 57 in the specific language corresponding to the coordinates q or
to the momenta p, may now be re-expressed in the general language
corresponding to an index n by which the eigenstates for any kind of
observable quantity or suitable set of such quantities can be labelled.

For the relation between the probability $W(n,t)$ of finding the system
in the state labelled by n and the probability amplitude $a(n,t)$ for
that state, we have

$$ W(n,t) = a^*(n,t)\,a(n,t), $$    (67.42)

where the values of the probability amplitude $a(n,t)$ can be referred
back if desired to an expression of the state of the system in coordinate
language with the help of (67.2).

For the transformation to a different language, involving probability
amplitudes $b(r,t)$ for states of a different kind r, we have

$$ a(k,t) = \sum_r S_{kr}\,b(r,t) \quad\text{and}\quad b(r,t) = \sum_k S^{-1}_{rk}\,a(k,t), $$    (67.43)

where the components of the unitary transformation matrix $S_{kr}$ depend
on the two kinds of states in a manner which can be referred back to
the properties of the system expressed in coordinate language with the
help of (67.29).

For the action on the probability amplitude $a(n,t)$ of an operator
corresponding in the present language to an observable quantity F,
we have

$$ F^{(n)}\,a(n,t) = \sum_k F_{nk}\,a(k,t), $$    (67.44)

where the components of the Hermitian matrix $F_{nk}$ can be referred
back to the properties of the system expressed in coordinate language
with the help of (67.9).

For the mean or expectation value of an observable quantity F we
have

$$ \bar{F} = \sum_{n,k} a^*(n,t)\,F_{nk}\,a(k,t). $$    (67.45)

And for the change in state with time we have the generalized
Schroedinger equation

$$ \sum_k H_{nk}\,a(k,t) + \frac{h}{2\pi i}\frac{\partial a(n,t)}{\partial t} = 0, $$    (67.46)

where the $H_{nk}$ are components of the Hermitian matrix corresponding
to the energy of the system.

By comparison it will be seen that these five equations are indeed
the analogues in general language of the five equations (57.1-5), which
we took as the original postulates of the quantum mechanics. These
equations thus provide a generalized statement of the principles of the
quantum mechanics, which can be regarded as valid in any quantum
mechanical language that may be of interest for a particular problem.
The form of the equations may seem a somewhat inappropriate one
when the specific language desired is actually one that corresponds
to observables exhibiting continuous rather than discrete spectra of
eigenvalues, and reference back to coordinate language may often be
desirable in treating a specific problem. Nevertheless, the provision of
a generalized language is very important since we can now deduce
consequences which will be valid in any quantum mechanical language
that may later be chosen as of interest, and since the generalized
language is one which often gives a specially good insight into the
characteristic features of the quantum mechanics. For this reason we
shall often use the above generalized language in developing the
fundamental principles of the quantum statistical mechanics.


68. The method of variation of constants 


The possibility of obtaining a transformed Schroedinger equation,
appropriate for use with any mode of representing the state of a system,
as discussed in the last section, finds a specially important application
when the state of the system is represented by an expansion in terms
of so-called unperturbed energy eigenfunctions, which correspond to a
Hamiltonian operator which differs from the true Hamiltonian for the
system by a small term which is regarded as a perturbation. This
method of treatment was first developed by Dirac,† and may be called
the method of variation of constants, since the constant coefficients,
such as the $c_k$ in equation (66.12), which would appear if the state of
the system were expanded in terms of the true energy eigenfunctions
for the system must now be replaced by coefficients which are allowed
to vary with the time.

The method of variation of constants is very important for the 
quantum mechanics in gaining an insight into the changes that take 
place in a system with time, since our natural interest often lies in the 
nearly steady states (unperturbed energy eigenstates) which a system 
can exhibit, and since the differential equations for the change in such 
states with time which the method provides can be readily integrated 
by an approximate method. Hence we can now afford to devote a sec- 
tion to the method, even though we shall again discuss it in Chapter XI 
when we undertake our general consideration of the changes that take 
place in quantum mechanical systems with time. We shall develop 
the method ab initio without reference to the general features of the
transformation theory. 

† Dirac, Proc. Roy. Soc. A, 112, 661 (1926); 114, 243 (1927).

(a) Derivation of the differential equations. Let us consider a
quantum mechanical system which can be treated with the help of a
Hamiltonian operator

$$ H = H^0 + V, $$    (68.1)

which can be regarded as the sum of two terms, the unperturbed
Hamiltonian $H^0$ corresponding to a nearly precise expression for the
energy of the system, and a perturbation term V corresponding to the
remainder necessary for an actually exact expression of the energy. For
example, in the case of a dilute gas the term $H^0$ could correspond to
an expression for the internal kinetic energy of the system regarded as
composed of non-interacting molecules, and the term V to the added
potential energy needed for an expression that also allows for the
effects of interaction at times of collision. Or in the case of an atom
in equilibrium with radiation the term $H^0$ could correspond to the sum
of the energies of the atom and the electromagnetic field regarded as
having no effect on each other, and the term V to the effects arising
from the actual interaction made evident by the fact that the atom could
absorb and emit radiation.

Without real loss in generality we can consider our system as isolated
with $H^0$ and V not explicitly dependent on the time.

Using the unperturbed Hamiltonian $H^0$, we can then obtain a set of
eigenfunctions $u_n(q)$ for the system with the help of the equation
determining characteristic states

$$ H^0 u_n(q) = E^0_n u_n(q), $$    (68.2)

where the quantities $E^0_n$ are the eigenvalues corresponding to the
operator $H^0$, degeneracy being allowed for by the possibility that the
eigenvalues for different values of the index n might actually turn out
to be equal. These eigenvalues $E^0_n$ may be conveniently called the
unperturbed energy eigenvalues for the system. In case the effect of
the perturbation term V is sufficiently small, they would be nearly
equal to the actual possible eigenvalues $E_n$ of the true energy of the
system.

Since the eigenfunctions $u_n(q)$ will form a complete, normalized,
orthogonal set, any state of the system can be conveniently represented,
in accordance with our previous discussion, as an expansion in terms
of these functions. In the absence of the perturbation term V, this
expansion would have at all times the form

$$ \psi(q,t) = \sum_k c_k\,u_k(q)\,e^{-(2\pi i/h)E^0_k t} $$

in accordance with our previous equation (66.12), where the $c_k$ are
constant coefficients. For the actual system, however, in the presence of
the perturbation V, we must use a more general form of expansion

$$ \psi(q,t) = \sum_k c_k(t)\,u_k(q)\,e^{-(2\pi i/h)E^0_k t}, $$    (68.3)

where we now allow the coefficients $c_k(t)$ to vary with the time in the
manner actually demanded by the Schroedinger equation of motion.
The expansion given by (68.3) is, of course, entirely possible, in
accordance with the completeness of the set $u_k(q)$, and differs in form
from our earlier general expansion (66.7) only because it is now
convenient to separate out from the total time dependence the periodic
part corresponding to the exponential term in (68.3).

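Substituting (68.3) into the Schroedinger equation

$$ H\psi + \frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = 0, $$

making use of the complete expression for the energy operator (68.1),
and also making use of the expression (68.2) by which the eigenfunctions
$u_k(q)$ are defined, we obtain

$$ \sum_k c_k(t)\,E^0_k u_k(q)\,e^{-(2\pi i/h)E^0_k t} + \sum_k c_k(t)\,V u_k(q)\,e^{-(2\pi i/h)E^0_k t} $$
$$ {} + \frac{h}{2\pi i}\sum_k \frac{dc_k(t)}{dt}\,u_k(q)\,e^{-(2\pi i/h)E^0_k t} - \sum_k c_k(t)\,E^0_k u_k(q)\,e^{-(2\pi i/h)E^0_k t} = 0. $$

Noting that the first and last terms of this expression will cancel, and
multiplying through by $u_n^*(q)\,e^{(2\pi i/h)E^0_n t}$ and integrating
over the full range of values for the coordinates q, we then obtain,
with the help of the normalization and orthogonality of the
eigenfunctions $u_k(q)$, the result

$$ \sum_k c_k(t)\,V_{nk}\,e^{(2\pi i/h)(E^0_n-E^0_k)t} + \frac{h}{2\pi i}\frac{dc_n(t)}{dt} = 0, $$

and this can be rewritten in the final form

$$ \frac{dc_n(t)}{dt} = -\frac{2\pi i}{h}\sum_k V_{nk}\,c_k(t)\,e^{(2\pi i/h)(E^0_n-E^0_k)t}, $$    (68.5)

where the quantities $V_{nk}$ are defined by

$$ V_{nk} = \int u_n^*(q)\,V u_k(q)\,dq, $$    (68.6)

and are seen to be the components of a Hermitian matrix with

$$ V_{nk} = V^*_{kn}. $$    (68.7)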

Equation (68.5) gives the desired expression for the transformed
Schroedinger equation, which in principle will allow us to calculate the
change with time in any selected coefficient $c_n(t)$ in the expansion
(68.3) from information as to the instantaneous values of all the
coefficients $c_k(t)$.

A knowledge of the values of these coefficients as a function of
the time is of course important, since in accordance with our original
expansion (68.3) it is evident that the probability at any time of finding
the system in the state indicated by the index n would be given by

$$ W_n(t) = c_n^*(t)\,c_n(t), $$    (68.8)

provided, of course, that the solution is normalized to give

$$ \int \psi^*(q,t)\,\psi(q,t)\,dq = \sum_n c_n^*(t)\,c_n(t) = 1. $$    (68.9)

(b) Approximate integration for a special case. The set of differential
equations given by (68.5) will in general be infinite in number.
Nevertheless, their approximate integration will be simple in case we
start the system off at time $t_0 = 0$ in a special state, with all the
coefficients $c_k(t_0)$ equal to zero except one, say $c_n(t_0)$, which we
put equal to unity; i.e.

$$ c_n(t_0) = 1, \qquad c_k(t_0) = 0 \quad (k \neq n). $$    (68.10)

Under these circumstances the system would be definitely in the state
indicated by the index n at the time $t = t_0 = 0$.

So far we have made no use of the smallness of the perturbation
term V and have introduced no approximation. If, however, we now
assume a successful separation of the energy expression into two parts
such that the effect of the term V in causing the coefficients to vary
with the time will be small, we can carry out an approximate
integration by treating the coefficients as nearly retaining the constant
values (68.10) over a short time interval in the neighbourhood of $t = 0$.

For the rate of increase in the coefficient $c_m(t)$ for any state m other
than that in which the system starts, we then obtain from (68.5)

$$ \frac{dc_m(t)}{dt} = -\frac{2\pi i}{h}\,V_{mn}\,e^{(2\pi i/h)(E^0_m-E^0_n)t}, $$

and this can evidently be integrated to give

$$ c_m(t) = V_{mn}\,\frac{1-e^{(2\pi i/h)(E^0_m-E^0_n)t}}{E^0_m-E^0_n} $$    (68.11)

as an approximately correct expression for the coefficient $c_m(t)$ in the
neighbourhood of the time $t = 0$. Multiplying (68.11) by the
corresponding expression for the conjugate complex quantity $c_m^*(t)$ we
can write, in accordance with (68.8),

$$ W_m(t) = c_m^*(t)\,c_m(t) = |V_{mn}|^2\,\frac{2 - e^{(2\pi i/h)(E^0_m-E^0_n)t} - e^{-(2\pi i/h)(E^0_m-E^0_n)t}}{(E^0_m-E^0_n)^2}, $$

or, by a simple transformation,

$$ W_m(t) = |V_{mn}|^2\,\frac{4\sin^2\{(\pi/h)(E^0_m-E^0_n)t\}}{(E^0_m-E^0_n)^2}, $$    (68.12)

and have thus obtained an expression, which should be valid in the
neighbourhood of $t = 0$, for the probability of finding the system in
a state m different from the state n in which it started.
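
The quality of the approximation (68.12) can be examined numerically. The following sketch, offered only as an illustration, integrates the differential equations (68.5) for a hypothetical system with just two unperturbed states by the classical fourth-order Runge-Kutta method, and compares the resulting $|c_m(t)|^2$ with (68.12). The two-state model, the numerical values of the energies and of $V_{mn}$, the choice of units with h = 2π, the step count, and the function names are all assumptions made here; the Runge-Kutta scheme is a standard numerical technique and not a method used in the text.

```python
import cmath
import math

H_PLANCK = 2.0 * math.pi  # assumption: units chosen so that 2*pi/h = 1


def rhs(t, c, e0, v):
    """Right-hand side of (68.5) for every coefficient c_n."""
    pref = -2j * math.pi / H_PLANCK
    out = []
    for n in range(len(c)):
        total = 0j
        for k in range(len(c)):
            phase = cmath.exp(2j * math.pi / H_PLANCK * (e0[n] - e0[k]) * t)
            total += v[n][k] * c[k] * phase
        out.append(pref * total)
    return out


def integrate(c0, e0, v, t_end, steps):
    """Fourth-order Runge-Kutta integration of (68.5) from t = 0 to t_end."""
    dt = t_end / steps
    c = list(c0)
    t = 0.0
    for _ in range(steps):
        k1 = rhs(t, c, e0, v)
        k2 = rhs(t + dt / 2, [x + dt / 2 * y for x, y in zip(c, k1)], e0, v)
        k3 = rhs(t + dt / 2, [x + dt / 2 * y for x, y in zip(c, k2)], e0, v)
        k4 = rhs(t + dt, [x + dt * y for x, y in zip(c, k3)], e0, v)
        c = [x + dt / 6 * (a + 2 * b + 2 * d + e)
             for x, a, b, d, e in zip(c, k1, k2, k3, k4)]
        t += dt
    return c


def first_order_probability(v_mn, e0_m, e0_n, t):
    """Transition probability according to (68.12)."""
    d_e = e0_m - e0_n
    return abs(v_mn) ** 2 * 4.0 * math.sin(math.pi / H_PLANCK * d_e * t) ** 2 / d_e ** 2


E0 = [0.0, 1.0]                 # hypothetical unperturbed energies
V = [[0.0, 0.01], [0.01, 0.0]]  # hypothetical small Hermitian perturbation
T_END = 5.0

c = integrate([1.0 + 0j, 0j], E0, V, T_END, 2000)
w_numeric = abs(c[1]) ** 2
w_formula = first_order_probability(V[1][0], E0[1], E0[0], T_END)
total = sum(abs(ck) ** 2 for ck in c)
print(w_numeric, w_formula, total)
```

With a perturbation as small as the one assumed here the two probabilities agree closely, and the sum of the probabilities over both states remains equal to unity to within the numerical error, in agreement with (68.9).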



Our treatment has not yet given us an expression for the decrease
in the probability $W_n(t)$ of finding the system in the original state n,
since it has been based on the approximation $c_n(t) = 1$ over the short
time interval involved in the integration. Nevertheless, if we now go
to the next stage of approximation and substitute the values of $c_m(t)$
given by (68.11) into (68.5), we can then compute the rate of change
of $c_n(t)$ with the time and thus obtain an expression for $W_n(t)$. The
computation is a little long, however, and we can obtain the same result
directly from the fact that the total probability of finding the system
in some state must always be equal to unity as given by (68.9). We
can then write at once

$$ W_n(t) = 1 - \sum_{m \neq n} |V_{mn}|^2\,\frac{4\sin^2\{(\pi/h)(E^0_m-E^0_n)t\}}{(E^0_m-E^0_n)^2} $$    (68.13)

for the probability of finding the system still in its original state n, in
the neighbourhood of the time $t = 0$, the summation being over all
states $m \neq n$.

In accordance with the expression given by (68.12), we see, on
account of the appearance of the so-called 'resonance denominator'
$(E^0_m - E^0_n)^2$, that there will be an appreciable tendency for
excitation of a quantum state m only when the corresponding unperturbed
energy $E^0_m$ lies close to the unperturbed energy $E^0_n$ of the initial
state n. Exact equality is not demanded since $E^0_m$ and $E^0_n$ are not
true energy levels for the actual system. Starting the system off at
$t = 0$ in the state n would correspond to a distribution over the
various possible values of the true energy, with a relatively high
probability of finding values close to $E^0_n$ if a measurement were made;
and this relatively high probability of finding such values would be
maintained in time in accordance with the quantum mechanical analogue of
the principle of the conservation of energy as discussed in §63 (c).

The foregoing shows the possibility of integrating the differential
equations (68.5) for the coefficients $c_k(t)$ that determine the state of
the system, provided that we start the system off in a specially simple
state with one of these coefficients $c_n(t)$ equal to unity, and that we
make use of a simple method of approximate calculation. It is evident,
however, that at least in principle we could give a similar treatment
starting the system off in any desired state, and that by successive
approximations we could achieve any desired accuracy in the
calculation.

This must now complete our account of the general features of the
quantum mechanics.



VIII

SOME SIMPLE APPLICATIONS OF QUANTUM MECHANICS


69. Simple one-dimensional solutions
In the preceding chapter we have developed the fundamental principles
and general methods for a non-relativistic quantum mechanics.
In the present chapter we shall give a brief and partial account of some 
simple applications. These will be chosen either in order to illustrate 
the general nature of the quantum mechanical treatment of the atoms 
and molecules composing the systems of usual interest in statistical 
mechanics, or in order to furnish material specifically needed for our 
later statistical studies. The results to be obtained in the chapter are 
all well known, and for the most part the treatment will be given in 
a condensed form with references to more complete treatments. The 
results of most immediate interest for statistical mechanics are ob- 
tained at the end of the present section, where we discuss the correlation 
between quantum mechanical states and extensions in the classical 
phase space, in §71 where we discuss the number of eigensolutions lying 
in any given energy range for the case of a particle in a container, in
§ 75 where we discuss the effect of spin on the number of eigensolutions 
for a particle, and in § 76 where we discuss the enumeration of eigen- 
solutions for systems composed of similar particles. The treatment in 
these places will be somewhat more complete. 

(a) Solutions in regions of constant potential. In the present section
we shall consider solutions of the Schroedinger equation which can be
regarded as corresponding to the one-dimensional motion of a particle
of mass m along the x-axis. We shall first treat the simple case of
solutions in regions of constant potential V(x). Such solutions often make
it possible to obtain considerable insight into the nature of quantum
mechanical results with a minimum of computation.

Since any solution $\psi(x,t)$ of the general Schroedinger equation (57.5)
can be expressed, in accordance with the methods of § 66 (c), in the form

$$ \psi(x,t) = \sum_k c_k\,u_k(x)\,e^{-(2\pi i/h)E_k t} $$    (69.1)

as a superposition of steady state solutions each multiplied by a suitable
coefficient $c_k$, it will be sufficient to investigate the solutions of the
time-free Schroedinger equation

$$ Hu(x) = Eu(x), $$    (69.2)


which provide the different eigenfunctions $u_k(x)$ and energy eigenvalues
$E_k$ to be used in making such superpositions.

The quantity E in the above equation is some particular possible
eigenvalue of the energy, and the Hamiltonian operator H will be given
in agreement with (56.6) by

$$ H = -\frac{h^2}{8\pi^2 m}\frac{d^2}{dx^2} + V(x), $$    (69.3)

where for our present simplified considerations the potential V(x) will
be taken as a constant independent of x, except for the possibility of
sudden changes in value in going from one region of constant potential
to another. Substituting (69.3) in (69.2), we then obtain

$$ \frac{d^2u}{dx^2} + \frac{8\pi^2 m}{h^2}(E-V)u = 0 $$    (69.4)

as the equation for determining the eigensolutions corresponding to
different allowable values of the energy E.

This equation has two types of solution according as the constant
E is greater or less than the constant V.

If the actual energy E is greater than the potential V, the solution will
be trigonometric and can be written quite generally, either with the
help of sines and cosines in the form

$$ u = A\sin\frac{2\pi}{h}\sqrt{2m(E-V)}\,x + B\cos\frac{2\pi}{h}\sqrt{2m(E-V)}\,x, $$    (69.5)

or, with the help of imaginary exponents, in the form

$$ u = A'e^{(2\pi i/h)\sqrt{2m(E-V)}\,x} + B'e^{-(2\pi i/h)\sqrt{2m(E-V)}\,x}, $$    (69.6)

where A and B or A' and B' are the pair of arbitrary constants that
correspond to the second-order character of the original differential
equation (69.4). The oscillatory nature of these solutions is
characteristic for particles with a total energy E greater than the
potential V, and is in agreement with the use of the name wave mechanics
as an alternative for quantum mechanics.

These two different forms, in which an eigensolution for the case
E > V can be written, are of course entirely equivalent and can be
transformed into each other. The first form (69.5) is, however, the
convenient one to take when we wish to use such eigensolutions in making
a superposition of steady state solutions (69.1) to correspond to a
situation where there are equal probabilities of finding the particle
moving either in the positive or negative x-direction, e.g. a particle
between reflecting walls. And the second form (69.6) is the convenient
one to take in dealing with situations where there can be a net
probability for motion in a given direction, e.g. a particle which has
chances of reflection or transmission at a place where there is a sudden
change in potential from one constant value to another.

To see the reason for this difference in the two forms of solution, we
note, in accordance with our previous expression (58.4), that the
probability current corresponding to any stationary state solution, of
the form $\psi(x,t) = u(x)\,e^{-(2\pi i/h)Et}$, would be given by

$$ S_x = \frac{h}{4\pi i m}\left(\psi^*\frac{\partial\psi}{\partial x} - \psi\frac{\partial\psi^*}{\partial x}\right) = \frac{h}{4\pi i m}\left(u^*\frac{\partial u}{\partial x} - u\frac{\partial u^*}{\partial x}\right). $$    (69.7)

This at once shows, however, that the first form of solution (69.5) will
be convenient for situations where there is no preferential flow, since
with the constants A and B both chosen as real the probability current
would be zero, $S_x = 0$. On the other hand, the second form of solution
(69.6) will be convenient for situations where there is a net flow. Thus,
for example, with the constant B' set equal to zero, we readily find
from (69.7)

$$ S_x = |A'|^2\sqrt{\frac{2(E-V)}{m}} = |A'|^2 v, $$    (69.8)

where v is the classical expression for the velocity of a particle with
total energy E and potential energy V.
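
The two statements about the probability current can be verified with a few lines of arithmetic. The sketch below is an illustration only: it evaluates (69.7) for the real standing-wave form (69.5) and for the travelling-wave form (69.6) with B' = 0, and compares the latter with $|A'|^2 v$ as in (69.8). The units (h = 2π, m = 1), the numerical values of E, V, the constants, and the point x, and the function names are assumptions introduced here.

```python
import cmath
import math

H_PLANCK = 2.0 * math.pi  # assumption: units with h / (2*pi) = 1
MASS = 1.0                # assumption: unit mass


def wavenumber(energy, potential):
    """(2*pi/h) * sqrt(2m(E - V)), valid for E > V."""
    return 2.0 * math.pi / H_PLANCK * math.sqrt(2.0 * MASS * (energy - potential))


def current(u, du):
    """Probability current (69.7) from u and du/dx at one point."""
    value = H_PLANCK / (4.0 * math.pi * 1j * MASS) * (
        u.conjugate() * du - u * du.conjugate())
    return value.real


def standing_wave(a, b, k, x):
    """Form (69.5) with real constants: returns (u, du/dx)."""
    u = a * math.sin(k * x) + b * math.cos(k * x)
    du = k * (a * math.cos(k * x) - b * math.sin(k * x))
    return u, du


def travelling_wave(a_prime, b_prime, k, x):
    """Form (69.6): returns (u, du/dx)."""
    forward = a_prime * cmath.exp(1j * k * x)
    backward = b_prime * cmath.exp(-1j * k * x)
    return forward + backward, 1j * k * (forward - backward)


E, V_POT = 3.0, 1.0  # hypothetical energies with E > V
k = wavenumber(E, V_POT)
u_s, du_s = standing_wave(0.9, -0.4, k, 0.37)
u_t, du_t = travelling_wave(0.7 + 0.2j, 0.0, k, 0.37)

current_standing = current(u_s, du_s)
current_travelling = current(u_t, du_t)
velocity = math.sqrt(2.0 * (E - V_POT) / MASS)
print(current_standing, current_travelling, abs(0.7 + 0.2j) ** 2 * velocity)
```

The standing wave carries no current whatever the real values of A and B, while the travelling wave carries a current equal to the probability density multiplied by the classical velocity.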

Turning now to a consideration of equation (69.4) for the case when
the actual energy E is less than the potential V, we see that the solution
will then be exponential in character and can be written in the general
form

$$ u = Ce^{(2\pi/h)\sqrt{2m(V-E)}\,x} + De^{-(2\pi/h)\sqrt{2m(V-E)}\,x}, $$    (69.9)

where C and D are now the pair of arbitrary constants corresponding
to the second-order character of the original differential equation. It
is specially noteworthy that this solution contains no implication that
the probability $u^*u\,dx$ for finding the particle in a region where the
total energy E is less than the potential V must be zero. Indeed it is
a characteristic feature of the quantum mechanics that particles can
penetrate and under suitable circumstances actually traverse classically
forbidden regions.

Although the above forms of eigensolution only hold in regions where
the potential V is a constant, it is nevertheless possible to allow sudden
change in the potential, from one constant value, say $V_1$, holding over
a given range of x to another value, say $V_2$, holding in a neighbouring
range, and then connect the appropriate expressions for the forms of
solution in the contiguous regions. In this way, introducing a succession
of stepwise changes in potential if necessary, it is possible to get an
approximate representation of many kinds of potential field and thus
to treat a wide variety of problems.

In passing from one region of constant potential $V_1$ to another region
of constant potential $V_2$, it is evident from the form of our differential
equation for the eigenfunctions (69.4) that there will be a sudden change
at the division point in the value of the second derivative $d^2u/dx^2$. For
the eigenfunction u itself, however, and for its first derivative $du/dx$ it
is evident that we can still require continuity at that point. Indeed,
if we did not demand this measure of continuity it would be possible
to construct solutions such that the probability density and probability
current would have to correspond to a source or sink at the division
point, which would in general be physically unallowable.

The requirement of continuity for u and du/dx at the boundary
between any pair of regions is sufficient to give the proper connexion
between the forms of solution holding on the two sides of a boundary
where a sudden change in potential takes place. For example, if to the
left of the point x = 0 we have an expression of the form (69.5) holding
with the constant potential $V_1 < E$, and to the right an expression of
the form (69.9) with the constant potential $V_2 > E$, it is evident that
the constants for the two forms of expression would be connected
by the relations

$$A + B = C + D$$

and

$$i(A-B)\sqrt{E-V_1} = (C-D)\sqrt{V_2-E}. \tag{69.10}$$

And similarly the connexion between expressions of the forms (69.6)
and (69.9) holding on the two sides of x = 0 would be given by

$$A = C + D$$

and

$$B\sqrt{E-V_1} = (C-D)\sqrt{V_2-E}. \tag{69.11}$$


In both cases it will be seen that the number of independent constants 
which can be arbitrarily chosen to describe the solution will be only 
two, in agreement with the second-order character of the original equa- 
tion (69.4), which governs the solution over the whole range in x. 
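
The matching of solutions across a step in potential can be checked numerically. The following is only a minimal sketch, assuming a trigonometric form u = A cos kx + B sin kx to the left of x = 0 and an exponential form u = C exp(κx) + D exp(−κx) to the right; the function name `match_step` and the choice of units with h/2π = m = 1 are illustrative assumptions and not part of the text.

```python
import math

def match_step(A, B, E, V1, V2, m=1.0, hbar=1.0):
    """Return (C, D) making u and du/dx continuous at x = 0.

    Assumed forms (illustrative):
      left  (x < 0): u = A cos(k x) + B sin(k x),        k     = sqrt(2m(E - V1)) / hbar
      right (x > 0): u = C exp(kappa x) + D exp(-kappa x), kappa = sqrt(2m(V2 - E)) / hbar
    """
    if not (V1 < E < V2):
        raise ValueError("requires V1 < E < V2")
    k = math.sqrt(2.0 * m * (E - V1)) / hbar
    kappa = math.sqrt(2.0 * m * (V2 - E)) / hbar
    # Continuity of u at x = 0:      A     = C + D
    # Continuity of du/dx at x = 0:  k * B = kappa * (C - D)
    C = 0.5 * (A + k * B / kappa)
    D = 0.5 * (A - k * B / kappa)
    return C, D
```

Only two of the four constants remain free, in agreement with the second-order character of the equation: choosing A and B fixes C and D.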

Making use of the indicated methods of treatment, insight as to the 
nature of quantum mechanical results can be obtained in a considerable 
variety of different specific situations. 

Thus it is easy to show that an oscillatory expression such as (69.6)
with A and B real holding in a region where $V_1$ is less than E will always
be connected with a diminishing exponential expression in a neigh-
bouring region where $V_2$ remains permanently greater than E, so that
there will be an ever-decreasing probability of finding the particle as
we proceed farther and farther into the classically forbidden region. It
can also be shown, for the limiting case where the potential goes to
infinity, that the solution for u will fall to zero at the boundary and
that there will be no probability of finding the particle outside the
classically permitted region; this situation can be regarded as providing
a model for a perfectly reflecting wall.

For the case of a particle in a region where the potential is less than E, bounded
on each side by regions where the potential remains permanently higher 
than E, it can be shown that the conditions for a diminishing exponen- 
tial form of expression at both sides can only be met with discrete 
values for E. By treating this so-called problem of the quantization 
of a particle in a trough, we thus obtain a simple illustration of the 
occurrence of discrete energy spectra in the quantum mechanics. 

Studies can also be made of the behaviour of particles when regions
in which the potential is less than E are connected with regions of
only limited extent where the potential $V_2$ is greater than E. In this
way considerable understanding can be obtained for the phenomena of
penetration through potential barriers in general, of resonance penetration,
and of weakened quantization for a particle between two such barriers.†

(b) Approximate solutions in regions of varying potential. In case
a more accurate treatment of one-dimensional problems is needed than
is provided by the above possibility of representing any actual dis-
tribution of potential by a series of constant potentials, use can be made
of a more closely approximate solution of the Schroedinger equation

$$\frac{d^2u}{dx^2} = -\frac{8\pi^2 m}{h^2}\{E - V(x)\}\,u \tag{69.12}$$

which has been specially studied by Wentzel, Kramers, and Brillouin.
For regions where E > V(x), this solution can be written in the trigono-
metric form

$$u(x) = \frac{A}{\sqrt[4]{2m(E-V)}}\cos\left\{\frac{2\pi}{h}\int\sqrt{2m(E-V)}\,dx\right\} + \frac{B}{\sqrt[4]{2m(E-V)}}\sin\left\{\frac{2\pi}{h}\int\sqrt{2m(E-V)}\,dx\right\} \tag{69.13}$$

† For a specially thorough and valuable study of the method, see Schroedinger, Berl.
Ber., Phys.-Math. Klasse, 1929, p. 668.



and in regions where E < V(x) in the exponential form

$$u(x) = \frac{C}{\sqrt[4]{2m(V-E)}}\,e^{\frac{2\pi}{h}\int\sqrt{2m(V-E)}\,dx} + \frac{D}{\sqrt[4]{2m(V-E)}}\,e^{-\frac{2\pi}{h}\int\sqrt{2m(V-E)}\,dx} \tag{69.14}$$


where A, B, C, and D are constants. By substituting these expressions
into (69.12) it will be seen that they do furnish a good approximation
provided |E−V| is large enough or |dV/dx| and |d²V/dx²| are small
enough. This condition makes the present treatment inapplicable to
discontinuous changes in potential such as just treated in § 69 (a). The
solutions fail in the neighbourhood of points where E = V, and we
have a transition between the two forms of expression. If at a point
x = a we pass from an 'exponential region' where E < V to a 'trigono-
metric region' where E > V, it is found† that exponential solutions
holding to the left of the point a can be satisfactorily connected with
trigonometric solutions to the right of the point a in the manner in-
dicated by the abbreviated formalism, in which (−) stands for $\sqrt{2m|E-V|}$,


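$$\frac{C_1}{\sqrt{(-)}}\,e^{-\frac{2\pi}{h}\int_x^a(-)\,dx} \;\underset{x=a}{\longleftrightarrow}\; \frac{2C_1}{\sqrt{(-)}}\cos\left\{\frac{2\pi}{h}\int_a^x(-)\,dx - \tfrac{1}{4}\pi\right\}, \tag{69.15}$$

$$\frac{C_2}{\sqrt{(-)}}\,e^{+\frac{2\pi}{h}\int_x^a(-)\,dx} \;\underset{x=a}{\longleftrightarrow}\; -\frac{C_2}{\sqrt{(-)}}\sin\left\{\frac{2\pi}{h}\int_a^x(-)\,dx - \tfrac{1}{4}\pi\right\}, \tag{69.16}$$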


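and, if at a point x = b we pass from a region of trigonometric to one
of exponential solutions, the connexions may be expressed in the form

$$\frac{2C_3}{\sqrt{(-)}}\cos\left\{\frac{2\pi}{h}\int_x^b(-)\,dx - \tfrac{1}{4}\pi\right\} \;\underset{x=b}{\longleftrightarrow}\; \frac{C_3}{\sqrt{(-)}}\,e^{-\frac{2\pi}{h}\int_b^x(-)\,dx}, \tag{69.17}$$

$$-\frac{C_4}{\sqrt{(-)}}\sin\left\{\frac{2\pi}{h}\int_x^b(-)\,dx - \tfrac{1}{4}\pi\right\} \;\underset{x=b}{\longleftrightarrow}\; \frac{C_4}{\sqrt{(-)}}\,e^{+\frac{2\pi}{h}\int_b^x(-)\,dx}, \tag{69.18}$$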


where the C's are arbitrary constants. This provides all that is neces-
sary for using the so-called W.K.B. approximate method of solution
for the general one-dimensional wave equation.‡

It will be of interest to apply the method to the problem of the
quantized behaviour of a particle in a region a < x < b where E > V,
bounded on both sides by regions −∞ < x < a and b < x < ∞ where

† See, for example, Pauli, Handbuch der Physik, xxiv/1, second edition, p. 171.
‡ For another method of approximation, which is perhaps even more satisfactory,
see Langer, Phys. Rev. 51, 669 (1937).





E < V over the entire range. From a classical point of view the particle
would be strictly confined between the points x = a and x = b, since
its total energy E would be less than the potential energy V needed
for positions outside that range. In the quantum mechanics, on the
other hand, some penetration of the particle into the classically for-
bidden regions would be possible, since the probability density would
not go abruptly to zero on passing the points a and b. Nevertheless,
in order to prevent the probability density from going to infinity at
x = ±∞, it is evident that we can only permit descending exponential
solutions outside the points a and b. Hence we are restricted in the
present problem to solutions of the forms (69.15) and (69.17), and
within the range a to b must have


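$$C_1\cos\left\{\frac{2\pi}{h}\int_a^x\sqrt{2m(E-V)}\,dx - \tfrac{1}{4}\pi\right\} = C_3\cos\left\{\frac{2\pi}{h}\int_x^b\sqrt{2m(E-V)}\,dx - \tfrac{1}{4}\pi\right\} \tag{69.19}$$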


holding for all points x. It will be readily seen, however, that this
relation can be satisfied for arbitrary values of x only if we have

$$\frac{2\pi}{h}\int_a^b\sqrt{2m(E-V)}\,dx = n\pi+\tfrac{1}{2}\pi \tag{69.20}$$

and

$$C_1 = (-1)^n C_3,$$

where n is an integer.

The first of these expressions may be rewritten in the forms

$$2\int_a^b\sqrt{2m(E-V)}\,dx = \oint p\,dx = (n+\tfrac{1}{2})\,h, \tag{69.21}$$

where p denotes the classical momentum of the particle and the integra-
tion is over a complete period for the classical motion of the particle
from a to b and back. In the case of a system of more than one degree
of freedom, such that the wave equation corresponding to (69.12) can
be treated by the method of separation of variables, the result given
by (69.21) can be applied separately to the coordinate and momentum,
$q_i$ and $p_i$, for each degree of freedom that corresponds classically to an
oscillatory motion. The approximations involved in obtaining (69.21)
become negligible at sufficiently large values of n.
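
As an illustration of the quantization rule (69.21), the phase integral can be evaluated numerically and solved for the energy. The sketch below does this for the potential V = kx²/2, assuming units with h = m = k = 1; the function names `phase_integral` and `wkb_energy_oscillator` and the bisection procedure are illustrative choices, not taken from the text. For this potential the closed phase integral is 2πE√(m/k), so the rule should return the values (n+½)hν with ν = (1/2π)√(k/m).

```python
import math

def phase_integral(E, V, a, b, m=1.0, steps=2000):
    """Approximate 2 * int_a^b sqrt(2m(E - V(x))) dx between turning points a, b."""
    mid, half = 0.5 * (a + b), 0.5 * (b - a)
    dtheta = math.pi / steps
    total = 0.0
    for i in range(steps):
        theta = -0.5 * math.pi + (i + 0.5) * dtheta
        x = mid + half * math.sin(theta)  # substitution smooths the turning points
        p_sq = 2.0 * m * (E - V(x))
        total += math.sqrt(max(p_sq, 0.0)) * half * math.cos(theta) * dtheta
    return 2.0 * total

def wkb_energy_oscillator(n, k=1.0, m=1.0, h=1.0):
    """Solve phase_integral(E) = (n + 1/2) h for V = k x^2 / 2 by bisection."""
    V = lambda x: 0.5 * k * x * x
    target = (n + 0.5) * h

    def mismatch(E):
        x0 = math.sqrt(2.0 * E / k)  # classical turning points at -x0 and +x0
        return phase_integral(E, V, -x0, x0, m) - target

    lo, hi = 1e-9, 1.0
    while mismatch(hi) < 0.0:  # enlarge the bracket until the root is enclosed
        hi *= 2.0
    for _ in range(60):
        centre = 0.5 * (lo + hi)
        if mismatch(centre) < 0.0:
            lo = centre
        else:
            hi = centre
    return 0.5 * (lo + hi)
```

For the oscillator the approximate rule happens to reproduce the exact levels; for other potentials the agreement is in general only asymptotic at large n.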

The result given by (69.21) is of interest in showing the connexion
between the older quantum theory and the present quantum mechanics.
In the older quantum theory the allowed values of the energy for quasi-
periodic steady states of motion were to be obtained, in accordance
with the Wilson-Sommerfeld rule of quantization, by setting the phase
integrals $\oint p_i\,dq_i$ equal to Planck's constant h multiplied by an integral
quantum number n. In the quantum mechanics we now see that a
better approximation is obtained in oscillatory situations by taking
half-integral quantum numbers n+½. (Cf. § 72, eq. (72.3).) However,
in simple one-dimensional rotational situations, not involving particle
spin, we shall still find integral quantum numbers. (Cf. § 73, remarks
in connexion with eq. (73.27).)

In accordance with the result given by (69.21) we can regard each 
of the successive energy eigenstates, obtained by taking successive 
values of the integer n, as correlated with the classical states which 
would lie in a region of the phase space of approximate area h. This 
is an illustration of our previous remark, end of § 20, that a region in
the phase space of magnitude $h^f$, in the case of a system of f degrees
of freedom, could be regarded as an approximate classical analogue for
a precisely defined quantum mechanical state. We shall note further
illustrations of this correlation in the course of the present chapter.

The volumes of the phase space to be associated with specified 
quantum mechanical states can be taken as assuming the precise values
$h^f$, as we approach the correspondence principle limit where the quan-
tum mechanics and classical mechanics lead to concordant conclusions,
i.e. in the above case as we go to large values of n. This will be of 
interest in a later chapter, § 84, when we discuss the agreement, at the 
correspondence principle limit, between the basic hypotheses for the 
classical and for the quantum statistics. 

70. Particle in free space 

We may next turn to a consideration of the behaviour of a free
particle in three-dimensional space. Since any solution of the complete
Schroedinger equation can be expressed as a superposition of
steady state solutions,

$$\psi(q,t) = \sum_k c_k\,u_k(q)\,e^{-\frac{2\pi i}{h}E_k t}, \tag{70.1}$$

we may first consider the nature of the eigenfunctions $u_k(q)$. These
will be solutions of the time-free Schroedinger equation

$$\mathbf{H}u = Eu, \tag{70.2}$$

where E is a particular eigenvalue of the energy and $\mathbf{H}$ is the Hamil-
tonian operator.

For the case of a particle in free space we may take the potential V
as a constant, which we can set equal to zero without loss of generality.
Using a coordinate representation corresponding to the Cartesian co-
ordinates x, y, z, the Hamiltonian operator then assumes the simple
form

$$\mathbf{H} = \frac{1}{2m}(\mathbf{p}_x^2+\mathbf{p}_y^2+\mathbf{p}_z^2) = -\frac{h^2}{8\pi^2 m}\left(\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}+\frac{\partial^2}{\partial z^2}\right),$$

where m is the mass of the particle. And substituting into (70.2), we
obtain

$$\frac{\partial^2 u}{\partial x^2}+\frac{\partial^2 u}{\partial y^2}+\frac{\partial^2 u}{\partial z^2}+\frac{8\pi^2 m}{h^2}E\,u = 0 \tag{70.3}$$

as the equation for determining the eigensolutions u(x, y, z) correspond-
ing to different values of E.

A general solution of this equation can evidently be written in the
form

$$u = C\sin(ax+\alpha)\sin(by+\beta)\sin(cz+\gamma), \tag{70.4}$$

where a, b, c, $\alpha$, $\beta$, $\gamma$, and C are constants, provided the first three of
these constants are connected by the relation

$$a^2+b^2+c^2 = \frac{8\pi^2 m}{h^2}E, \tag{70.5}$$

thus reducing the number of independent constants to six in agreement
with the form of the original equation (70.3).

A solution of the form (70.4) is, however, not very convenient for 
studying situations of any great physical interest, since taking the con- 
stants occurring in the sine terms as real we see that it then merely 
corresponds to a uniform sinusoidal distribution of the probability 
density u*u for finding the particle somewhere in space, and to a zero
value for the probability current S in any direction. 

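A form of solution which is better adapted for giving insight into
situations of physical interest is obtained by considering solutions
which are simultaneously characteristic of the three components of
momentum $p_x$, $p_y$, $p_z$ and of the energy E. Since the operators for all
of these observables commute with each other it is evident, in accor-
dance with the discussion of § 64 (d), that such solutions should be pos-
sible. As the four equations which must be satisfied by such a solution
we evidently have

$$\mathbf{p}_x u = p_x u,\quad \mathbf{p}_y u = p_y u,\quad \mathbf{p}_z u = p_z u,\quad \mathbf{H}u = Eu,$$

or, making use of a coordinate representation,

$$\frac{h}{2\pi i}\frac{\partial u}{\partial x} = p_x u,\quad \frac{h}{2\pi i}\frac{\partial u}{\partial y} = p_y u,\quad \frac{h}{2\pi i}\frac{\partial u}{\partial z} = p_z u,\quad -\frac{h^2}{8\pi^2 m}\left(\frac{\partial^2 u}{\partial x^2}+\frac{\partial^2 u}{\partial y^2}+\frac{\partial^2 u}{\partial z^2}\right) = Eu. \tag{70.6}$$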

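As a solution satisfying all of these equations, we then have

$$u(x,y,z) = C\,e^{\frac{2\pi i}{h}(p_x x+p_y y+p_z z)} \tag{70.7}$$

with $E = \frac{1}{2m}(p_x^2+p_y^2+p_z^2)$
and C a constant.

Multiplying (70.7) by the appropriate exponential time factor, we
obtain

$$\psi(x,y,z,t) = C\,e^{\frac{2\pi i}{h}(p_x x+p_y y+p_z z-Et)} \tag{70.8}$$

as the corresponding time-dependent solution. And introducing the
wave numbers $\sigma_x$, $\sigma_y$, $\sigma_z$ and frequency $\nu$ defined by

$$p_x = h\sigma_x,\quad p_y = h\sigma_y,\quad p_z = h\sigma_z,\quad E = h\nu, \tag{70.9}$$

we are once more led to the de Broglie waves for a free particle

$$\psi(x,y,z,t) = C\,e^{2\pi i(\sigma_x x+\sigma_y y+\sigma_z z-\nu t)} \tag{70.10}$$

already obtained in § 61.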

Solutions of the form (70.8) may be superposed to give wave packets
which will give an approximate representation of the kinematical be-
haviour of a single particle as already studied in § 62. They may also
be used directly to represent an infinite stream of similar non-interacting
particles of a given energy, having the uniform constant probability
density

$$W = \psi^*\psi = |C|^2 \tag{70.11}$$

and the constant components of probability current

$$S_x = \frac{h}{4\pi i m}\left(\psi^*\frac{\partial\psi}{\partial x}-\psi\frac{\partial\psi^*}{\partial x}\right) = \frac{p_x}{m}|C|^2,\ \text{etc.} \tag{70.12}$$

This latter use is valuable in the study of problems involving collision
with a scatterer, where a solution in Cartesian coordinates to represent
the oncoming particles can be combined with a solution in polar co-
ordinates around the scatterer to represent the deflected particles.


71. Particle in a container

(a) The energy eigenvalues and eigenfunctions. We now turn to a
consideration of a particle inside a container rather than completely
free in space. This will be a first step towards our later consideration
of systems, commonly treated in statistical mechanics, such as an
assembly of gas molecules in a vessel.

Since any solution of the Schroedinger equation for such a system
can be regarded as given by a superposition of eigensolutions $u_k(q)$,
corresponding to the different eigenvalues $E_k$ for the energy of the
particle, i.e. by

$$\psi(q,t) = \sum_k c_k\,u_k(q)\,e^{-\frac{2\pi i}{h}E_k t}, \tag{71.1}$$

we may at once turn our attention to these eigensolutions. As the
equation determining these eigensolutions we have, as before,

$$\mathbf{H}u(q) = E\,u(q). \tag{71.2}$$

The quantity E occurring in this equation is some particular possible 
eigenvalue for the energy of the system, and the Hamiltonian operator 
H for a single simple particle of mass m will evidently be given in 
agreement with (56.5) by 

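$$\mathbf{H} = \frac{1}{2m}(\mathbf{p}_x^2+\mathbf{p}_y^2+\mathbf{p}_z^2)+V(x,y,z) = -\frac{h^2}{8\pi^2 m}\left(\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}+\frac{\partial^2}{\partial z^2}\right)+V(x,y,z), \tag{71.3}$$

where x, y, and z are Cartesian coordinates for the position of the
particle and V is the classical expression for the potential energy of
the particle as a function of position. Substituting (71.3) in (71.2), we
then obtain

$$\frac{\partial^2 u}{\partial x^2}+\frac{\partial^2 u}{\partial y^2}+\frac{\partial^2 u}{\partial z^2}+\frac{8\pi^2 m}{h^2}(E-V)\,u = 0 \tag{71.4}$$

as the equation for determining the different eigensolutions u(x, y, z) in
question.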

For simplicity let us now consider that the container for the particle
is a rectangular box having one corner at the origin of coordinates
x = y = z = 0, and having the length, breadth, and height $x = l_1$,
$y = l_2$, and $z = l_3$. Inside the box the potential V would be a constant
and this may be conveniently taken as the starting-point for energy
measurements, so that we may put V = 0 inside the container. On the
other hand, at the walls of the container, x = 0, $x = l_1$, etc., we shall
take the potential as suddenly increasing to infinity in agreement with the
perfectly reflecting character which we shall wish to ascribe to the walls.

Taking V = 0 inside the box, we may then evidently write as a
general solution of (71.4) in the interior the same expression that we
found for the case of the free particle

$$u = C\sin(ax+\alpha)\sin(by+\beta)\sin(cz+\gamma), \tag{71.5}$$

where a, b, c, $\alpha$, $\beta$, $\gamma$, and C are constants, the first three being con-
nected by the relation

$$a^2+b^2+c^2 = \frac{8\pi^2 m}{h^2}E, \tag{71.6}$$

thus reducing the number of independent constants to six, as would
be expected from the form of (71.4).

We must next consider the boundary conditions which would be
imposed on this solution by the presence of the walls of the box. Since
by hypothesis V goes to infinity at the walls, it is evident from the
form of (71.4) that we must require u itself to be zero at the walls,
since otherwise its second derivative would become infinite and we should
have an overwhelming probability of finding the particle outside rather
than inside the box. To obtain the result u = 0 at x = 0, y = 0, and
z = 0 we must set the constants

$$\alpha = \beta = \gamma = 0, \tag{71.7}$$

and to obtain that result at $x = l_1$, $y = l_2$, and $z = l_3$ we must take
the constants

$$a = \frac{n_1\pi}{l_1},\quad b = \frac{n_2\pi}{l_2},\quad c = \frac{n_3\pi}{l_3}, \tag{71.8}$$

where $n_1$, $n_2$, and $n_3$ are integers.

Under these circumstances our eigensolution (71.5) reduces to the
form

$$u = C\sin\frac{n_1\pi x}{l_1}\sin\frac{n_2\pi y}{l_2}\sin\frac{n_3\pi z}{l_3} \tag{71.9}$$

and the corresponding eigenvalue of the energy has, in accordance
with (71.6) and (71.8), the value

$$E = \frac{h^2}{8m}\left(\frac{n_1^2}{l_1^2}+\frac{n_2^2}{l_2^2}+\frac{n_3^2}{l_3^2}\right). \tag{71.10}$$

The solution can be normalized to unity by taking the constant

$$C = \sqrt{\frac{8}{l_1 l_2 l_3}}. \tag{71.11}$$
We can thus obtain a complete set of normalized orthogonal eigen-
functions corresponding to different choices of the
quantum numbers $n_1$, $n_2$, and $n_3$, and to the different possible discrete
values of the energy E. In case any two of the lengths $l_1$, $l_2$, $l_3$ are
commensurable, degeneracy will occur since different eigensolutions
would then correspond to the same value of the energy. Having ob-
tained the set of eigenfunctions, any solution for the state of the system
can then be expressed as a superposition of such eigenfunctions each
multiplied by a suitable coefficient and exponential time factor.
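
The occurrence of degeneracy for commensurable lengths can be seen by direct enumeration. The following is a minimal sketch, assuming integer box lengths and measuring the energy (71.10) in units of h²/8m; the function name `box_levels` and the cut-off `n_max` are illustrative assumptions.

```python
from collections import defaultdict
from fractions import Fraction

def box_levels(l1, l2, l3, n_max=4):
    """Group quantum states (n1, n2, n3) by their energy in units of h^2 / (8 m).

    The lengths l1, l2, l3 are assumed to be positive integers so that the
    energies n1^2/l1^2 + n2^2/l2^2 + n3^2/l3^2 can be compared exactly.
    """
    levels = defaultdict(list)
    for n1 in range(1, n_max + 1):
        for n2 in range(1, n_max + 1):
            for n3 in range(1, n_max + 1):
                energy = (Fraction(n1 * n1, l1 * l1)
                          + Fraction(n2 * n2, l2 * l2)
                          + Fraction(n3 * n3, l3 * l3))
                levels[energy].append((n1, n2, n3))
    return dict(sorted(levels.items()))
```

For a cube, `box_levels(1, 1, 1)` gives a single state at the lowest energy 3 and three states sharing the next energy 6.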

(b) The number of eigensolutions in a given range of energy. In our
later statistical mechanical work we shall be specially interested in the
number of different eigensolutions or steady quantum states, having
eigenvalues of the energy lying within a given range, say E to $E+\Delta E$.

To determine this number let us consider a system of Cartesian
coordinates with axes for plotting the values of the three quantum
numbers $n_1$, $n_2$, $n_3$, and let us construct a cubical lattice consisting of
points whose coordinates are integral values of these quantities. Each
such point with positive values of $n_1$, $n_2$, and $n_3$ will then correspond
to a particular eigensolution or quantum state of the system. Further-
more, in the interior of such a lattice there will be one unit cube for
each lattice point. Hence, if E is not too small, we can calculate the
total number of quantum states corresponding to values of the energy
less than E by considering the volume

$$V = \iiint dn_1\,dn_2\,dn_3 \tag{71.12}$$

over the range of quantum numbers given by

$$\frac{n_1^2}{l_1^2}+\frac{n_2^2}{l_2^2}+\frac{n_3^2}{l_3^2} \le \frac{8mE}{h^2}, \tag{71.13}$$

which, in accordance with (71.10), will include all quantum states
with energy eigenvalues less than E. From the known formula (see
Appendix II) for the volume of an ellipsoid this will give us

$$V = \frac{4\pi}{3}\,l_1 l_2 l_3\left(\frac{8mE}{h^2}\right)^{3/2} = \frac{32\pi v}{3h^3}(2mE)^{3/2}, \tag{71.14}$$

where $v = l_1 l_2 l_3$ is the spatial volume of the container for the molecule.
And since the positive values of the quantum numbers with
which we are concerned are limited to one octant, we shall then have

$$\tfrac{1}{8}V = \frac{4\pi v}{3h^3}(2mE)^{3/2} \tag{71.15}$$

as the number of eigensolutions or quantum states with eigenvalues of
the energy less than E. By differentiation we then obtain

$$G = \frac{4\pi v m}{h^3}\sqrt{2mE}\,\Delta E \tag{71.16}$$

as the desired expression for the number of eigensolutions or quantum
states G corresponding to the energy range E to $E+\Delta E$. In case the
particles have a spin, which will be treated in § 75, the number of such
states will be twice as great:

$$G = \frac{8\pi v m}{h^3}\sqrt{2mE}\,\Delta E. \tag{71.17}$$

We shall find these results very important for our later statistical
considerations. By comparing the number of eigensolutions in the
range $\Delta E$ as given by (71.16) with the volume of the phase space
$4\pi v m\sqrt{2mE}\,\Delta E$, which would correspond classically to that range, we
see for large quantum numbers that we can regard each eigensolution
for this system of three degrees of freedom as correlated with an exten-
sion in the phase space of magnitude $h^3$. This is an example of the
general possibility for such correlations as discussed at the end of
§ 69 (b).
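
The replacement of a count of lattice points by the volume of an octant can be tested directly. The sketch below assumes a cubical container, for which the condition (71.13) becomes n₁² + n₂² + n₃² ≤ R² with R² = 8mEl²/h², and for which the octant volume (71.15) reduces to πR³/6; the function names `count_states` and `octant_volume` and the restriction to integer R are illustrative assumptions.

```python
import math

def count_states(R):
    """Count triples of positive integers with n1^2 + n2^2 + n3^2 <= R^2 (integer R)."""
    r_sq = R * R
    count = 0
    for n1 in range(1, R + 1):
        for n2 in range(1, R + 1):
            rest = r_sq - n1 * n1 - n2 * n2
            if rest < 1:
                break  # larger n2 only makes the remainder smaller
            count += math.isqrt(rest)  # admissible n3 run from 1 to isqrt(rest)
    return count

def octant_volume(R):
    """One eighth of the volume of a sphere of radius R."""
    return math.pi * R ** 3 / 6.0
```

The ratio `count_states(R) / octant_volume(R)` lies somewhat below unity and approaches it as R grows, in keeping with the remark that the estimate holds when E is not too small.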

72. Particle in a Hooke’s law field of force 

As our next simple mechanical system we shall consider a particle
in a potential field $V = \frac{1}{2}kx^2$, which from a classical point of view would
give us a linear harmonic oscillator. This will also be a step towards
the treatment of systems of physical interest since the oscillations of
actual atomic systems can be treated with some degree of success by
similar methods.

In accordance with the suggested form, $V = \frac{1}{2}kx^2$, for the potential
function, the time-free Schroedinger equation

$$\mathbf{H}u = Eu, \tag{72.1}$$

which determines the energy eigenvalues E and eigenfunctions u for
the problem, can be written as

$$\frac{d^2u}{dx^2}+\frac{8\pi^2 m}{h^2}\left(E-\tfrac{1}{2}kx^2\right)u = 0. \tag{72.2}$$

The eigenvalues of E which will lead to solutions of this equation,
such that u(x) does not go to infinity as x goes to infinity, are found
to be given by the simple formula

$$E_n = (n+\tfrac{1}{2})\frac{h}{2\pi}\sqrt{\frac{k}{m}} = (n+\tfrac{1}{2})h\nu \quad (n = 0, 1, 2, 3, \ldots), \tag{72.3}$$

where $\nu$ is an abbreviation for the classical frequency with which such
a particle would oscillate. Furthermore, the complete set of normalized
orthogonal eigenfunctions, which are solutions of (72.2) for the different
values of n, are found to be expressible in the form†

$$u_n(x) = N_n\,e^{-\xi^2/2}H_n(\xi), \tag{72.4}$$

† For a detailed derivation of this solution of (72.2), together with a table of Hermite
polynomials $H_n(\xi)$, see, for example, Pauling and Wilson, Introduction to Quantum
Mechanics, New York, 1935, pp. 67-81.



where for simplicity we have made the substitution

$$\xi = \sqrt{\frac{2\pi\sqrt{mk}}{h}}\;x, \tag{72.5}$$

the Hermite polynomials are defined by

$$H_n(\xi) = (-1)^n e^{\xi^2}\frac{d^n e^{-\xi^2}}{d\xi^n}, \tag{72.6}$$

and the normalizing factor $N_n$, leading to the value unity when the
square of an individual eigenfunction is integrated over all values of
the original variable x, is given by

$$N_n = \left\{\frac{1}{2^n\,n!}\sqrt{\frac{2\sqrt{mk}}{h}}\right\}^{1/2}. \tag{72.7}$$

These results are of interest in connexion with the oscillatory be-
haviour of diatomic molecules, and the behaviour of the modes of
vibration by which we can represent the thermal energy of an elastic
solid or the electromagnetic energy of the radiation in a hollow en-
closure. In connexion with (72.3), it is specially interesting to note
the appearance of 'half-quantum numbers' in disagreement with the
integral quantum numbers prescribed by the older quantum theory.
In accordance with this result, the lowest oscillatory quantum state
would exhibit a so-called zero-point energy $\frac{1}{2}h\nu$. The superiority of the
new result in the case of diatomic molecules was first observationally
demonstrated by the work of Mulliken.†

By comparing the expression for energy given by (72.3) with the
classical expression $E\,2\pi\sqrt{m/k}$ for the area in the phase space that
would correspond classically to values of the energy less than E, we
see for large values of n that we can regard each eigensolution for this
system of one degree of freedom as correlated with an extension in the
phase space of magnitude h. This is another example of the general
possibility for such correlations as discussed at the end of § 69 (b).
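
The normalized orthogonal character of the oscillator eigenfunctions can be verified by numerical integration. The sketch below works in the dimensionless variable ξ, so that the normalizing factor is taken as 1/√(2ⁿ n! √π) for integration over ξ rather than over the original variable x; this choice, the recurrence used for the Hermite polynomials, and the function names are illustrative assumptions.

```python
import math

def hermite(n, xi):
    """Hermite polynomial H_n(xi) from the recurrence H_{k+1} = 2 xi H_k - 2 k H_{k-1}."""
    h_prev, h = 0.0, 1.0
    for k in range(n):
        h_prev, h = h, 2.0 * xi * h - 2.0 * k * h_prev
    return h

def eigenfunction(n, xi):
    """Oscillator eigenfunction normalized to unity for integration over xi."""
    norm = 1.0 / math.sqrt(2.0 ** n * math.factorial(n) * math.sqrt(math.pi))
    return norm * math.exp(-0.5 * xi * xi) * hermite(n, xi)

def overlap(n, k, xi_max=10.0, steps=4000):
    """Trapezoidal estimate of the integral of u_n * u_k over -xi_max..xi_max."""
    d = 2.0 * xi_max / steps
    total = 0.0
    for i in range(steps + 1):
        xi = -xi_max + i * d
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * eigenfunction(n, xi) * eigenfunction(k, xi)
    return total * d
```

Calling `overlap(n, n)` should return a value very close to unity, and `overlap(n, k)` with n ≠ k a value very close to zero.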


73. Particle in a central field of force 
As our next mechanical system we shall take a simple particle in 
a central field of force. In such a field of force the angular momentum 
around any axis — if it has a definite value at some initial time — will 
remain fixed, in accordance with the treatment of § 63 (d), so long as 
the system is not disturbed; and we shall be interested in knowing the 
possible eigenvalues which the angular momentum can assume. We 
shall commence by considering some general properties of the operators 
† Mulliken, Phys. Rev. 25, 259 (1925).





for the components of angular momentum which are of interest without 
reference to the kind of force field acting on the particle. 

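(a) Operators for the components of angular momentum. If we con-
sider a particle with the Cartesian coordinates x, y, z and corresponding
components of momentum $p_x$, $p_y$, $p_z$, the classical expressions for its
components of angular momentum around the x, y, and z coordinate
axes would be

$$M_x = yp_z - zp_y,\quad M_y = zp_x - xp_z,\quad M_z = xp_y - yp_x. \tag{73.1}$$

Making use of our rules for obtaining the corresponding operators
suitable for use in coordinate language, we at once obtain as the corre-
sponding operators

$$\mathbf{M}_x = \frac{h}{2\pi i}\left(y\frac{\partial}{\partial z}-z\frac{\partial}{\partial y}\right),\quad \mathbf{M}_y = \frac{h}{2\pi i}\left(z\frac{\partial}{\partial x}-x\frac{\partial}{\partial z}\right),\quad \mathbf{M}_z = \frac{h}{2\pi i}\left(x\frac{\partial}{\partial y}-y\frac{\partial}{\partial x}\right), \tag{73.2}$$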


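where no ambiguity arises as to the order of factors, since non-conjugate
coordinates and momenta commute.† Furthermore, by squaring and
adding we obtain for the operator for the square of the total angular
momentum with respect to the origin

$$\mathbf{M}^2 = \mathbf{M}_x^2+\mathbf{M}_y^2+\mathbf{M}_z^2 = \left(\frac{h}{2\pi i}\right)^2\Big\{x^2\Big(\frac{\partial^2}{\partial y^2}+\frac{\partial^2}{\partial z^2}\Big)+y^2\Big(\frac{\partial^2}{\partial z^2}+\frac{\partial^2}{\partial x^2}\Big)+z^2\Big(\frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}\Big)-2xy\frac{\partial^2}{\partial x\,\partial y}-2yz\frac{\partial^2}{\partial y\,\partial z}-2zx\frac{\partial^2}{\partial z\,\partial x}-2x\frac{\partial}{\partial x}-2y\frac{\partial}{\partial y}-2z\frac{\partial}{\partial z}\Big\}. \tag{73.3}$$

For some purposes it is more convenient to express these operators
in the language of a set of polar coordinates r, $\theta$, $\phi$ which can be intro-
duced in accordance with the equations

$$x = r\sin\theta\cos\phi,\quad y = r\sin\theta\sin\phi,\quad z = r\cos\theta. \tag{73.4}$$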

† Treating these operators (73.2) as the components of a vector $\mathbf{M}$, they have the
important property that the changes in any scalar function of position under an infini-
tesimal rotation $\delta\boldsymbol{\omega}$ may be simply expressed in terms of them by

$$\delta F(x,y,z) = -\frac{2\pi i}{h}(\delta\boldsymbol{\omega}\cdot\mathbf{M})\,F(x,y,z),$$

where $(\delta\boldsymbol{\omega}\cdot\mathbf{M})$ is the inner product of the two vectors. See for this, and its connexion
with the commutation rules (73.7) and (75.1), Pauli, Handbuch der Physik, xxiv/1,
second edition, pp. 176 ff.



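By a straightforward but somewhat tedious process of substitution the
above operators are then found to assume the forms

$$\mathbf{M}_x = -\frac{h}{2\pi i}\left(\sin\phi\frac{\partial}{\partial\theta}+\cot\theta\cos\phi\frac{\partial}{\partial\phi}\right),\quad \mathbf{M}_y = \frac{h}{2\pi i}\left(\cos\phi\frac{\partial}{\partial\theta}-\cot\theta\sin\phi\frac{\partial}{\partial\phi}\right),\quad \mathbf{M}_z = \frac{h}{2\pi i}\frac{\partial}{\partial\phi}, \tag{73.5}$$

$$\mathbf{M}^2 = -\left(\frac{h}{2\pi}\right)^2\left\{\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right)+\frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right\}. \tag{73.6}$$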

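It will be of interest, also in connexion with our later introduction
of the spin of the electron and other fundamental particles, to con-
sider the commutation properties of the above operators. It will be
readily found that the operators for the different components of angular
momentum do not commute with each other, but have the commutators

$$\mathbf{M}_x\mathbf{M}_y-\mathbf{M}_y\mathbf{M}_x = \frac{ih}{2\pi}\mathbf{M}_z,\quad \mathbf{M}_y\mathbf{M}_z-\mathbf{M}_z\mathbf{M}_y = \frac{ih}{2\pi}\mathbf{M}_x,\quad \mathbf{M}_z\mathbf{M}_x-\mathbf{M}_x\mathbf{M}_z = \frac{ih}{2\pi}\mathbf{M}_y. \tag{73.7}$$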
On the other hand, it will be seen that the operator for the square of 
the total angular momentum with respect to the origin commutes with 
any of the operators for the component of angular momentum around 
a particular axis. 

In accordance with the above, we may then conclude that it would 
not be possible for the components of angular momentum around two 
different axes to exhibit precisely defined values at the same time, 
except in the special case when all components, and thus $M^2$, vanish.
On the other hand, however, it would be possible for the square of the 
total angular momentum with respect to the origin and the component 
around any selected axis to exhibit simultaneously defined values. 
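
These commutation properties can be illustrated with a small numerical example. The sketch below uses the familiar 3 × 3 matrices that represent the components of angular momentum for the total quantum number l = 1, in units with h/2π = 1; the use of a matrix representation in place of the differential operators, and the names `MX`, `MY`, `MZ`, `matmul`, `commutator`, and `total_squared`, are assumptions made for illustration only.

```python
import math

S = 1.0 / math.sqrt(2.0)
# Assumed matrix representation for l = 1, in units with h / (2 pi) = 1.
MX = [[0, S, 0], [S, 0, S], [0, S, 0]]
MY = [[0, -1j * S, 0], [1j * S, 0, -1j * S], [0, 1j * S, 0]]
MZ = [[1, 0, 0], [0, 0, 0], [0, 0, -1]]

def matmul(a, b):
    """Product of two 3 x 3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def commutator(a, b):
    """Return a b - b a."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]

def total_squared():
    """Return MX^2 + MY^2 + MZ^2."""
    squares = [matmul(m, m) for m in (MX, MY, MZ)]
    return [[sum(sq[i][j] for sq in squares) for j in range(3)] for i in range(3)]
```

In this representation `commutator(MX, MY)` equals i times `MZ`, while `total_squared()` is a multiple of the unit matrix and therefore commutes with each component.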

(b) The eigenfunctions and eigenvalues corresponding to angular
momentum. We may now inquire into the eigenfunctions and eigen-
values corresponding to a component of angular momentum around
a particular axis and to the total angular momentum with respect to
the origin.

Let us first consider the angular momentum around some definite
direction which we may take as the z-axis. Taking the variables r, $\theta$,
and $\phi$, the eigenfunctions $u(r,\theta,\phi)$, corresponding to an eigenvalue $M_z$
for the z-component of angular momentum, would then be given, in
accordance with (64.2), by

$$\mathbf{M}_z u(r,\theta,\phi) = M_z\,u(r,\theta,\phi). \tag{73.8}$$

Substituting the expression for the operator $\mathbf{M}_z$ given by (73.5), this
gives us the equation

$$\frac{h}{2\pi i}\frac{\partial}{\partial\phi}u(r,\theta,\phi) = M_z\,u(r,\theta,\phi) \tag{73.9}$$

for determining the desired eigenfunctions.

This equation can easily be treated by the method of separation of
variables by assuming a solution of the form

$$u(r,\theta,\phi) = v(r,\theta)\,\Phi(\phi), \tag{73.10}$$

where the function of $\phi$ alone must evidently satisfy the equation

$$\frac{h}{2\pi i}\frac{d}{d\phi}\Phi(\phi) = M_z\,\Phi(\phi). \tag{73.11}$$

And this has the general form of solution

$$\Phi(\phi) = C\,e^{\frac{2\pi i}{h}M_z\phi}, \tag{73.12}$$

where C is a constant.

For such a solution, however, to be a satisfactory constituent of an
eigenfunction $u(r,\theta,\phi)$, which could be used in predicting the probability
of finding the particle at any point r, $\theta$, $\phi$, it will be appreciated that
$\Phi(\phi)$ must have the same values for $\phi$ and $\phi+2\pi$. Hence the con-
stituent eigenfunctions $\Phi(\phi)$ must have the special form

$$\Phi(\phi) = C\,e^{im\phi}, \tag{73.13}$$

with

$$m = 0, \pm1, \pm2, \pm3, \ldots, \tag{73.14}$$

and with

$$M_z = m\frac{h}{2\pi}, \tag{73.15}$$

where m, not to be confused with the mass of the particle, may be
called the quantum number for the allowed eigenvalues of the
z-component of angular momentum. The constant C may be given
the value

$$C = \frac{1}{\sqrt{2\pi}} \tag{73.16}$$

in case it is desired to normalize to unity for integration over the range
0 to $2\pi$.

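Turning now to the total angular momentum with respect to the
origin, we shall have the characteristic equation

$$\mathbf{M}^2 u(r,\theta,\phi) = M^2\,u(r,\theta,\phi). \tag{73.17}$$

Substituting the expression for the operator $\mathbf{M}^2$ given by (73.6), and
the solution characteristic of $M_z$ provided by (73.10), we can then obtain
solutions simultaneously characteristic of total angular momentum
squared and its z-component from the equation

$$-\left(\frac{h}{2\pi}\right)^2\left\{\frac{1}{\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right)+\frac{1}{\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right\}v(r,\theta)\Phi(\phi) = M^2\,v(r,\theta)\Phi(\phi). \tag{73.18}$$

This equation can also evidently be treated by the method of separa-
tion of variables if we write the solution in the form

$$u(r,\theta,\phi) = v(r,\theta)\Phi(\phi) = R(r)\Theta(\theta)\Phi(\phi), \tag{73.19}$$

which, by substituting the specific expression for $\Phi(\phi)$ given by (73.13),
will require

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta(\theta)}{d\theta}\right)-\frac{m^2}{\sin^2\theta}\Theta(\theta)+\left(\frac{2\pi}{h}\right)^2 M^2\,\Theta(\theta) = 0. \tag{73.20}$$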

The allowable solutions of this latter equation can be shown to be of
the form†

$$\Theta(\theta) = C\,P_l^{|m|}(\cos\theta), \tag{73.21}$$

with

$$l = |m|, |m|+1, |m|+2, \ldots, \tag{73.22}$$

and with the allowed eigenvalues for the square of the total angular
momentum

$$M^2 = l(l+1)\left(\frac{h}{2\pi}\right)^2, \tag{73.23}$$

where

$$P_l^{|m|}(\cos\theta) = \frac{1}{2^l\,l!}\sin^{|m|}\theta\,\frac{d^{\,l+|m|}(\cos^2\theta-1)^l}{d(\cos\theta)^{l+|m|}} \tag{73.24}$$

is the associated Legendre function of order m and degree l, and l may
be called the quantum number for total angular momentum. The con-
stant C can be given the value

$$C = \sqrt{\frac{(2l+1)\,(l-|m|)!}{2\,(l+|m|)!}} \tag{73.25}$$

in case it is desired to normalize to unity when the square of (73.21) is
multiplied by $\sin\theta\,d\theta$ and integrated from $\theta = 0$ to $\pi$.

Returning to (73.19), we may now write

$$\psi(r,\theta,\phi,t_0) = u(r,\theta,\phi) = R(r)\,P_l^{|m|}(\cos\theta)\,e^{im\phi} \tag{73.26}$$

as an expression for the probability amplitude giving a state of the
system at a time $t_0$ such that the particle would exhibit the component
of angular momentum around the z-axis

$$M_z = m\frac{h}{2\pi}\quad (m = 0, \pm1, \pm2, \pm3, \ldots) \tag{73.27}$$


† For a detailed derivation of this solution of (73.20), together with a table of
normalized Legendre functions $P_l^{|m|}(\cos\theta)$, see, for example, Pauling and Wilson, Intro-
duction to Quantum Mechanics, New York, 1935, chap. v.



and the total angular momentum with respect to the origin

$$M^2 = l(l+1)\left(\frac{h}{2\pi}\right)^2\quad (l = |m|, |m|+1, |m|+2, \ldots). \tag{73.28}$$

Comparing (73.27) with the classical expression $\oint M_z\,d\phi = 2\pi M_z$ for the phase integral corresponding to rotation around the z-axis, we see in this simple situation that the successive eigenvalues of $M_z$ would be determined by the original Wilson-Sommerfeld rule of setting the phase integral equal to Planck's constant h multiplied by an integer. And noting that (73.28) would give eigenvalues for the absolute magnitude of the total angular momentum which would approach $|M| = lh/2\pi$ at large values of l, we see that the successive values of $|M|$ would then be separated by the amount demanded by the Wilson-Sommerfeld rule. In both cases we see, for large quantum numbers, that we can regard the eigensolutions for the single degree of freedom in question as associated with an extension in the classical phase space of magnitude h. This is still another example of the general possibility for such correlations as discussed at the end of § 69 (b), this time for angular momentum eigensolutions instead of for energy eigensolutions.
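
The approach to the Wilson-Sommerfeld spacing at large l can be checked with a few lines of arithmetic. The sketch below, with illustrative function names of our own choosing, works in units of $h/2\pi$ and tabulates both the ratio $|M|/l$ and the separation between successive values of $|M|$ given by (73.28).

```python
import math

def total_angular_momentum(l):
    """|M| from (73.28), expressed in units of h/(2*pi)."""
    return math.sqrt(l * (l + 1))

def spacing(l):
    """Separation between the values of |M| for l+1 and l, in units of h/(2*pi)."""
    return total_angular_momentum(l + 1) - total_angular_momentum(l)

for l in (1, 10, 100, 1000):
    ratio = total_angular_momentum(l) / l
    print(l, ratio, spacing(l))
```

Both columns tend to 1 as l grows, so that the separation tends to one unit of $h/2\pi$, as stated above.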

(c) Steady states in a central field of force. So far we have made no assumption which would require our particle to remain permanently in any particular eigenstate of angular momentum. We may now introduce the assumption that the field of force, which would be acting on the particle from a classical point of view, is spherically symmetrical around the origin of coordinates. We shall then be able to find steady states corresponding to permanently fixed eigenvalues for all three quantities: component of angular momentum $M_z$, total angular momentum squared $M^2$, and energy E.

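Assuming such a radial field of force, the Hamiltonian operator can then be written in the form

$$\mathbf{H} = \frac{1}{2m}(\mathbf{p}_x^2+\mathbf{p}_y^2+\mathbf{p}_z^2)+V(r) = -\frac{h^2}{8\pi^2 m}\left\{\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right)+\frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right)+\frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right\}+V(r), \tag{73.29}$$

where V(r) is the spherically symmetrical potential field, and we have introduced in the last form of writing the known expression for the Laplacian operator in spherical coordinates.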



It will be seen on examination, however, that this Hamiltonian operator commutes both with that for the z-component of angular momentum given by (73.5) and with that for the total angular momentum squared given by (73.6). Under these circumstances we can then conclude in the first place, in accordance with § 64 (d), that we can have states simultaneously characteristic of all three quantities $M_z$, $M^2$, and E; and can conclude in the second place, in accordance with (63.3), that these angular momenta will then be conserved with time so long as the system is left undisturbed.

To examine such states, permanently characteristic of all three 
observables, we have 

$$\mathbf{H}u(r, \theta, \phi) = Eu(r, \theta, \phi) \tag{73.30}$$

as the general equation for a steady state of constant energy E. And 
substituting for the operator H the expression given by (73.29), this 
can be written in the specific form 

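$$\left\{\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right)+\frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right)+\frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}\right\}u(r,\theta,\phi)+\frac{8\pi^2 m}{h^2}\{E-V(r)\}u(r,\theta,\phi) = 0. \tag{73.31}$$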


For $u(r, \theta, \phi)$, however, we may now substitute the specific form

$$u(r,\theta,\phi) = R(r)\,\Theta(\theta)\,\Phi(\phi) \tag{73.32}$$

already made characteristic for the angular momenta $M_z$ and $M^2$. Doing so, and noting the form of $\Phi(\phi)$ given by (73.13), we obtain

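$$\frac{\Theta}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right)+\frac{R}{r^2\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right)-\frac{m^2}{r^2\sin^2\theta}R\,\Theta+\frac{8\pi^2 m}{h^2}\{E-V(r)\}R(r)\,\Theta(\theta) = 0.$$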


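And noting the property of $\Theta(\theta)$ given by (73.20), this becomes

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right)-\frac{4\pi^2 M^2}{h^2 r^2}R(r)+\frac{8\pi^2 m}{h^2}\{E-V(r)\}R(r) = 0.$$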


By substitution of (73.23), this then gives the desired result

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right)-\frac{l(l+1)}{r^2}R(r)+\frac{8\pi^2 m}{h^2}\{E-V(r)\}R(r) = 0 \tag{73.33}$$

as an equation for determining the form of the remaining function R(r), in states of energy E, and angular momenta $M_z = mh/2\pi$ and $M^2 = l(l+1)h^2/4\pi^2$. Writing

$$\chi(r) = rR(r), \tag{73.34}$$

this result can also be expressed in the simpler form

$$\frac{d^2\chi}{dr^2}+\frac{8\pi^2 m}{h^2}\left[E-\frac{h^2}{8\pi^2 m}\frac{l(l+1)}{r^2}-V(r)\right]\chi(r) = 0, \tag{73.35}$$


which is similar to that for a one-dimensional problem and may be treated by the approximate methods of § 69 (b).

The actual solutions of this equation will depend on the form of V(r); and the conditions on E necessary to secure an allowable form for use as a constituent of the eigenfunction $u(r,\theta,\phi)$ will determine the spectrum of energy eigenvalues E. The energy levels will, in general, be degenerate since more than one value of the quantum number m can correspond to any given value of l except for the case of l = 0. As a special condition on the allowability of eigensolutions it should be stated that $\chi(r) = rR(r)$ has to go to zero at r = 0, since otherwise it can be shown that the Hermitian character of the Hamiltonian would be lost at this singular point.† Having found a satisfactory solution for R(r), we then have all that is necessary for a knowledge of the eigenfunction

$$u(r, \theta, \phi) = C\,P_l^m(\cos\theta)\,e^{im\phi}R(r) = C\,P_l^m(\cos\theta)\,e^{im\phi}\frac{\chi(r)}{r}, \tag{73.36}$$

which corresponds to specified possible values of E, $M_z$, and $M^2$. It will be noted, in agreement with earlier discussions, that these eigensolutions for the three-dimensional problem may be regarded, on going to high quantum numbers, as each associated with an extension of magnitude $h^3$ in the classical phase space.


74. Two interacting particles 

In the foregoing applications we have considered the behaviour of a single particle in a steady potential field. Let us now consider the somewhat more complicated case of two non-identical interacting particles of masses $m_1$ and $m_2$. This will provide methods useful in treating the two-particle problems presented by the hydrogen atom, by the diatomic molecule, and in the theory of collisions and scattering. We may again adopt the method of studying steady state solutions since any solution could then be obtained by superposition.

The energy eigensolutions u{q) corresponding to the steady states of 

† See Pauli, Handbuch der Physik, xxiv/1, second edition, pp. 123-4.





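our present system will now be a function of the Cartesian coordinates $x_1, y_1, z_1$ and $x_2, y_2, z_2$ for each of the two particles, and the equation determining them can evidently be written in the form

$$\frac{1}{m_1}\left(\frac{\partial^2 u}{\partial x_1^2}+\frac{\partial^2 u}{\partial y_1^2}+\frac{\partial^2 u}{\partial z_1^2}\right)+\frac{1}{m_2}\left(\frac{\partial^2 u}{\partial x_2^2}+\frac{\partial^2 u}{\partial y_2^2}+\frac{\partial^2 u}{\partial z_2^2}\right)+\frac{8\pi^2}{h^2}(E-V)u = 0, \tag{74.1}$$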


where $m_1$ and $m_2$ are the masses of the two particles, E is the total energy of the combined system, and V is the potential regarded as a function of the coordinates of the two particles.

(a) Separation into external and internal equations. If we now assume that the potential V can be regarded as the sum of two terms $V_{\rm ext}$ and $V_{\rm int}$, the one dependent on the coordinates for the centre of gravity of the two particles and the other on the coordinate differences between the two particles, it proves possible to separate the above equation into two equivalent equations depending separately on the coordinates of the centre of gravity of the system and on the coordinate differences. To accomplish this we may introduce the new variables X, Y, Z,

$$X = \frac{m_1x_1+m_2x_2}{m_1+m_2}, \qquad Y = \frac{m_1y_1+m_2y_2}{m_1+m_2}, \qquad Z = \frac{m_1z_1+m_2z_2}{m_1+m_2}, \tag{74.2}$$

which correspond classically to the position of the centre of gravity of the system as a whole, and x, y, z,


$$x = x_2-x_1, \qquad y = y_2-y_1, \qquad z = z_2-z_1, \tag{74.3}$$

which correspond classically to the differences between the coordinates 
of the two particles. 

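On making the indicated substitutions, equation (74.1) is found to assume the form

$$\frac{1}{m_1+m_2}\left(\frac{\partial^2 u}{\partial X^2}+\frac{\partial^2 u}{\partial Y^2}+\frac{\partial^2 u}{\partial Z^2}\right)+\frac{1}{\mu}\left(\frac{\partial^2 u}{\partial x^2}+\frac{\partial^2 u}{\partial y^2}+\frac{\partial^2 u}{\partial z^2}\right)+\frac{8\pi^2}{h^2}\{E-V_{\rm ext}(X,Y,Z)-V_{\rm int}(x,y,z)\}u = 0, \tag{74.4}$$

where we have now explicitly indicated that the potential V is assumed expressible as a sum of two parts depending on the external and internal coordinates, and the symbol $\mu$ has been introduced for the so-called reduced mass of the system

$$\mu = \frac{m_1m_2}{m_1+m_2}. \tag{74.5}$$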

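We readily see, however, that equation (74.4) will be satisfied by a solution of the form

$$u(X, Y, Z, x, y, z) = u_{\rm ext}(X, Y, Z)\,u_{\rm int}(x, y, z), \tag{74.6}$$

provided the new functions $u_{\rm ext}$ and $u_{\rm int}$ are taken as solutions of the separate equations

$$\frac{1}{m_1+m_2}\left(\frac{\partial^2 u_{\rm ext}}{\partial X^2}+\frac{\partial^2 u_{\rm ext}}{\partial Y^2}+\frac{\partial^2 u_{\rm ext}}{\partial Z^2}\right)+\frac{8\pi^2}{h^2}(E_{\rm ext}-V_{\rm ext})u_{\rm ext} = 0 \tag{74.7}$$

and

$$\frac{1}{\mu}\left(\frac{\partial^2 u_{\rm int}}{\partial x^2}+\frac{\partial^2 u_{\rm int}}{\partial y^2}+\frac{\partial^2 u_{\rm int}}{\partial z^2}\right)+\frac{8\pi^2}{h^2}(E-E_{\rm ext}-V_{\rm int})u_{\rm int} = 0, \tag{74.8}$$

where $E_{\rm ext}$ is a constant. This is evidently satisfactory since by multiplying the first of these equations by $u_{\rm int}$ and the second by $u_{\rm ext}$ and adding, we reproduce the original equation (74.4) with the suggested form of solution (74.6).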

It will be noted that the first of the two equations is the same as that for a single particle with a total mass equal to the sum $m_1+m_2$ of the masses of the two particles, with the total energy $E_{\rm ext}$ and moving in a potential field $V_{\rm ext}(X, Y, Z)$. This equation can hence be treated by the same methods as for a single particle, and the allowable values of $E_{\rm ext}$ can give a continuous or discrete spectrum according to the nature of the external field $V_{\rm ext}$.

(b) Separation of internal equation in case of central forces. The second of the above equations (74.8) can also be given a simple treatment in case the internal potential is assumed to be solely a function of the classical expression for the distance between the two particles $r = \sqrt{x^2+y^2+z^2}$. To treat the problem under these circumstances we may first transform to polar coordinates r, $\theta$, and $\phi$ in accordance with the equations

$$x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta. \tag{74.9}$$

Classically this corresponds to taking polar coordinates for the second particle with the first particle at the origin. Introducing these new




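coordinates into (74.8), we obtain

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial u_{\rm int}}{\partial r}\right)+\frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial u_{\rm int}}{\partial\theta}\right)+\frac{1}{r^2\sin^2\theta}\frac{\partial^2 u_{\rm int}}{\partial\phi^2}+\frac{8\pi^2\mu}{h^2}\{E_{\rm int}-V_{\rm int}(r)\}u_{\rm int} = 0, \tag{74.10}$$

where we have explicitly indicated the assumed dependence of $V_{\rm int}$ on r alone, and have introduced the symbol $E_{\rm int}$ in accordance with

$$E_{\rm int} = E-E_{\rm ext}. \tag{74.11}$$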

It will be immediately noticed that equation (74.10) for our two particles has the same form as our earlier equation (73.31) for a single particle in a spherically symmetrical field, with the mass of the single particle m replaced by the reduced mass $\mu$ for the two particles. This will make it possible to make use of results obtained in the preceding section. To do this we may treat equation (74.10) by the method of separation of variables by assuming a solution of the form

$$u_{\rm int}(r, \theta, \phi) = R(r)\,\Theta(\theta)\,\Phi(\phi), \tag{74.12}$$

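where we take $\Phi$, $\Theta$, and R as solutions of the individual equations

$$\frac{d^2\Phi}{d\phi^2}+m^2\Phi = 0, \tag{74.13}$$

$$\frac{1}{\sin\theta}\frac{d}{d\theta}\left(\sin\theta\frac{d\Theta}{d\theta}\right)+\beta\Theta-\frac{m^2}{\sin^2\theta}\Theta = 0, \tag{74.14}$$

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right)-\frac{\beta}{r^2}R+\frac{8\pi^2\mu}{h^2}\{E_{\rm int}-V_{\rm int}(r)\}R = 0, \tag{74.15}$$

with $\beta$ and m as constants. This separation is evidently satisfactory since the original equation (74.10) will evidently be reproduced with the suggested form of solution (74.12), if we multiply the above three equations by $R\Theta/r^2\sin^2\theta$, $R\Phi/r^2$, and $\Theta\Phi$ respectively and then add.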

(c) Solutions of the separate equations. The three separate equations which we have thus obtained are closely related to equations which we have already encountered in the preceding section, owing to the similarity in form of (74.10) with (73.31) mentioned above.

As a satisfactory solution of the first of the above equations we may evidently take, in agreement with (73.13) and (73.14),

$$\Phi(\phi) = e^{im\phi} \tag{74.16}$$

with

$$m = 0, \pm1, \pm2, \pm3, \ldots \tag{74.17}$$

and shall have in agreement with (73.15)

$$M_z = m\frac{h}{2\pi} \tag{74.18}$$

as the eigenvalues for the z-component of the angular momentum of the pair of particles.

As a satisfactory solution of the second of the above equations (74.14) we may evidently take, in agreement with (73.21) and (73.22),

$$\Theta(\theta) = C\,P_l^m(\cos\theta) \tag{74.19}$$

with

$$l = |m|, |m|+1, |m|+2, \ldots \tag{74.20}$$

and shall have, in agreement with (73.23),

$$M^2 = \beta\frac{h^2}{4\pi^2} = l(l+1)\frac{h^2}{4\pi^2} \tag{74.21}$$

as the allowed eigenvalues for the total angular momentum squared. The normalizing factors for (74.16) and (74.19) are, of course, the same as already given by (73.16) and (73.25).

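Finally, substituting the value for $\beta$ given by (74.21) into (74.15), we can rewrite the third of our separated equations in the form

$$\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{dR}{dr}\right)+\frac{8\pi^2\mu}{h^2}\left[E-\frac{h^2}{8\pi^2\mu}\frac{l(l+1)}{r^2}-V(r)\right]R = 0, \tag{74.22}$$

where for simplicity we have now dropped the subscripts from $E_{\rm int}$ and $V_{\rm int}$. Putting

$$R(r) = \frac{\chi(r)}{r}, \tag{74.23}$$

this can be more simply written as

$$\frac{d^2\chi}{dr^2}+\frac{8\pi^2\mu}{h^2}\left[E-\frac{h^2}{8\pi^2\mu}\frac{l(l+1)}{r^2}-V(r)\right]\chi = 0, \tag{74.24}$$

which is similar in form to that for a one-dimensional problem and may be treated by the approximate methods discussed in § 69 (b).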

The actual solutions of (74.22) and (74.24) which will be obtained will depend, of course, on the form of the potential V(r). For the solutions to represent a permanently associated pair of particles they may be normalized to give

$$\int_0^\infty |R|^2 r^2\,dr = \int_0^\infty |\chi|^2\,dr = 1. \tag{74.25}$$

And for the study of collisions they can be normalized to represent the desired probability current. For the solutions to be allowable it is necessary for $\chi(r) = rR(r)$ to go to zero at r = 0, as already mentioned in connexion with (73.35).
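
The role of the condition $\chi(0) = 0$ can be illustrated with the simplest possible case, V(r) = 0. The sketch below is a hedged illustration rather than a method from the text: it works in units chosen so that $8\pi^2\mu/h^2 = 2$, takes $E = k^2/2$, and checks by central differences that $\sin kr$ (l = 0) and $\sin kr/kr - \cos kr$ (l = 1) satisfy (74.24). The names `residual`, `chi_l0`, `chi_l1`, and `chi_bad` are our own.

```python
import math

def residual(chi, V, l, E, r, h=1e-3):
    """Left-hand side of (74.24) at radius r, in units where 8*pi^2*mu/h^2 = 2."""
    d2chi = (chi(r + h) - 2.0 * chi(r) + chi(r - h)) / h**2
    return d2chi + 2.0 * (E - l * (l + 1) / (2.0 * r * r) - V(r)) * chi(r)

k = 2.0                 # wave number, an arbitrary illustrative value
E = k * k / 2.0         # energy of the free relative motion in these units

def V_free(r):
    return 0.0

def chi_l0(r):
    return math.sin(k * r)                                  # allowable, l = 0

def chi_l1(r):
    return math.sin(k * r) / (k * r) - math.cos(k * r)      # allowable, l = 1

def chi_bad(r):
    return math.cos(k * r)      # solves the l = 0 equation but chi(0) = 1

for r in (0.5, 1.0, 2.5):
    print(r, residual(chi_l0, V_free, 0, E, r), residual(chi_l1, V_free, 1, E, r))
print(chi_l0(0.0), chi_l1(1e-6), chi_bad(0.0))
```

The residuals vanish to within the discretization error, the first two functions go to zero at the origin, and `chi_bad` is an example of a formal solution that is excluded because it does not.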

Having obtained a suitable solution, we may then write the complete eigensolution of (74.10), corresponding to a steady state of a system of two particles, in the form

$$u(r, \theta, \phi) = C\,P_l^m(\cos\theta)\,e^{im\phi}R(r) = C\,P_l^m(\cos\theta)\,e^{im\phi}\frac{\chi(r)}{r}. \tag{74.26}$$


(d) Indicated nature of applications. We may now indicate the nature of some of the applications which can be made of this result. The possible applications differ in accordance with the assumptions made as to the form of the potential V(r).

If we assume that the potential corresponds to the Coulomb attraction of an electron of charge $-e$ and the core of an atom of effective charge $Ze$, the potential would be of the form

$$V(r) = -\frac{Ze^2}{r}. \tag{74.27}$$

Substituting this value into (74.22), it is found possible to obtain exact eigensolutions of that equation. In the case E < 0, corresponding to a bound electron, the eigensolutions can be expressed in terms of the so-called associated Laguerre polynomials of degree $(n-l-1)$ and order $(2l+1)$, where n is a new so-called total quantum number, and the corresponding allowed eigenvalues of E are given by

$$E_n = -\frac{2\pi^2\mu Z^2e^4}{n^2h^2} \tag{74.28}$$

with $n = l+1, l+2, l+3, \ldots$.

This result is in agreement with the known energy levels of the hydrogen atom when the fine structure, due to spin and relativistic effects, is neglected. In the case E > 0, corresponding to a free electron, the eigensolutions can be expressed in terms of hypergeometric functions and the energy levels give a continuous spectrum.
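
To give a feeling for the magnitudes involved, the bound-state formula (74.28) can be evaluated for hydrogen (Z = 1) with the reduced mass (74.5) of the electron and proton. The sketch below uses rounded modern CGS (Gaussian) values for the constants; these numbers and the function names are assumptions supplied for illustration and do not come from the text.

```python
import math

# Rounded CGS (Gaussian) constants, supplied here only as illustrative inputs.
H_PLANCK = 6.62607e-27      # erg s
E_CHARGE = 4.80320e-10      # esu
M_ELECTRON = 9.10938e-28    # g
M_PROTON = 1.67262e-24      # g
ERG_PER_EV = 1.602177e-12   # erg per electron-volt

def reduced_mass(m1, m2):
    """Reduced mass of two particles, as in (74.5)."""
    return m1 * m2 / (m1 + m2)

def coulomb_level(n, mu, Z=1):
    """Bound-state energy from (74.28), in erg."""
    return -2.0 * math.pi**2 * mu * Z**2 * E_CHARGE**4 / (n**2 * H_PLANCK**2)

mu = reduced_mass(M_ELECTRON, M_PROTON)
for n in (1, 2, 3):
    print(n, coulomb_level(n, mu) / ERG_PER_EV, "eV")
```

With these inputs the lowest level comes out close to -13.6 eV, and the higher levels scale as $1/n^2$.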

A second possibility of applying the foregoing results is presented by the rotation and vibration of diatomic molecules, provided that the two atoms are of a character to permit their idealization as simple particles without spin. Under these circumstances we can use equation (74.24) in a very simple manner to obtain a zero-order approximation for the lower energy levels of such a system. We first assume that the potential function can be represented with sufficient approximation by




taking a Hooke's law of force between the atoms in the neighbourhood of some equilibrium distance $r_e$ and writing

$$V = \tfrac12 k(r-r_e)^2 = \tfrac12 kx^2. \tag{74.29}$$

Returning to equation (74.24), we then replace $r^2$ by the constant value $r_e^2$ in the denominator of the term containing $l(l+1)$, since $\chi$ will be small and V(r) large except in the neighbourhood of $r = r_e$. Substituting x for $r-r_e$, equation (74.24) can then be written as

$$\frac{d^2\chi}{dx^2}+\frac{8\pi^2\mu}{h^2}\left[E-\frac{l(l+1)h^2}{8\pi^2\mu r_e^2}-\tfrac12 kx^2\right]\chi = 0, \tag{74.30}$$

which we recognize by comparison with (72.2) to be that for a linear harmonic oscillator with $E-l(l+1)h^2/8\pi^2\mu r_e^2$ as the parameter whose eigenvalues must now be chosen so as to secure acceptable eigensolutions for $\chi(x)$. In accordance with (72.3), we can then write for the energy eigenvalues of such a system

$$E_{nl} = \left(n+\tfrac12\right)\frac{h}{2\pi}\sqrt{\frac{k}{\mu}}+\frac{l(l+1)h^2}{8\pi^2\mu r_e^2} \tag{74.31}$$

or, using more nearly the usual notation of band spectroscopists,

$$E_{vJ} = \left(v+\tfrac12\right)h\nu_e+\frac{J(J+1)h^2}{8\pi^2 I_e}, \tag{74.32}$$

where v and J are quantum numbers that take the values 0, 1, 2, 3, ..., and the constants $\nu_e$ and $I_e$ may be called the equilibrium values for vibration frequency and moment of inertia. This gives a good representation of non-electronic energy levels for diatomic molecules in singlet $\Sigma$ states, i.e. those for which the model is appropriate, and forms a satisfactory basis for higher order approximations that can be made. Noting the relation of l to m given by (74.20), we see for any given value of the quantum number J (in our present notation) that the quantum number m which would specify the angular momentum of the molecule around a given axis could assume the values

$$m = 0, \pm1, \pm2, \ldots, \pm J. \tag{74.33}$$

Hence the multiplicity or quantum weight of a level specified by v and J would be given by

$$g = 2J+1. \tag{74.34}$$
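
The level scheme and weights just described are easy to tabulate. In the sketch below the values of $\nu_e$ and $I_e$ are hypothetical round numbers of the order appropriate to a light diatomic molecule; they, the value of h, and the function names are assumptions made for illustration only. The energy is taken in the usual rigid-rotator plus harmonic-oscillator form, $(v+\tfrac12)h\nu_e + J(J+1)h^2/8\pi^2 I_e$.

```python
import math

H_PLANCK = 6.62607e-27   # erg s (rounded modern value, an assumption)
NU_E = 9.0e13            # s^-1, hypothetical equilibrium vibration frequency
I_E = 2.6e-40            # g cm^2, hypothetical equilibrium moment of inertia

def level_energy(v, J, nu_e=NU_E, I_e=I_E):
    """Vibration-rotation energy in erg for quantum numbers v and J."""
    vibration = (v + 0.5) * H_PLANCK * nu_e
    rotation = J * (J + 1) * H_PLANCK**2 / (8.0 * math.pi**2 * I_e)
    return vibration + rotation

def m_values(J):
    """Allowed values of m for a given J: 0, +-1, ..., +-J."""
    return list(range(-J, J + 1))

def weight(J):
    """Multiplicity of a level with rotational quantum number J."""
    return 2 * J + 1

for v in (0, 1):
    for J in (0, 1, 2):
        print(v, J, level_energy(v, J), weight(J))
```

Counting the entries of `m_values(J)` reproduces the weight 2J + 1, and with values of this order the rotational spacings are small compared with the vibrational quantum $h\nu_e$.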

Further possibilities of applying (74.26) arise in the study of collisions between particles, which can be undertaken by setting up a steady state solution to represent the colliding and scattered particle (i.e. an incoming plane wave and an outgoing spherical wave). For a treatment of these problems see, for instance, The Theory of Atomic Collisions, by Mott and Massey (Oxford, 1933).




75. Particles with spin 

(a) The spin variables and operators. A complete and satisfactory account of all the electronic energy levels, found in the analysis of atomic spectra, has only been made possible by ascribing to the electron, as originally proposed by Uhlenbeck and Goudsmit, in addition to the properties of position and momentum which correspond to the classical observables x, y, z and $p_x$, $p_y$, $p_z$, a further property called electron spin, which corresponds to the appearance of certain new non-classical observables $s_x$, $s_y$, $s_z$ called the components of spin with reference to the axes indicated. This new property of electron spin has the general nature of an intrinsic angular momentum, together with an associated magnetic moment, which must be ascribed to the electron as permanent attributes without reference to its orbital motion. Furthermore, it has now been found that similar spin properties must be assigned to all the fundamental material particles (electrons, positrons, protons, and neutrons) whose existence at present seems necessary.

From the analysis of spectral and other data it is found that the component of intrinsic angular momentum parallel to any given axis can only exhibit the eigenvalues $\pm\tfrac12(h/2\pi)$, in contrast with the integral multiples of the Bohr unit of angular momentum $h/2\pi$ found in cases of ordinary rotation as studied in §§ 73 and 74. For the magnitude of the total intrinsic angular momentum we then have

$$\sqrt{3}\,(h/4\pi) = \sqrt{\tfrac12\left(1+\tfrac12\right)}\,h/2\pi.$$

The ratio of intrinsic magnetic moment to intrinsic angular momentum is found to be different for the different fundamental particles. For the electron the ratio is $-e/mc$, where $-e$ is the charge and m is the mass of the electron. It is interesting, although perhaps not very fundamental, to note that this is the ratio which would be calculated classically for a sphere of mass m and uniform density revolving on its axis and carrying a surface charge $-e$. For the positron the absolute magnitude of the ratio is presumably the same as for the electron. For the proton the ratio is about $2.7e/Mc$, where M is the mass of the proton. For the neutron the ratio appears to be of the same order as for the proton.

Although from a classical point of view a charged particle revolving around its own axis should exhibit an intrinsic angular momentum with an associated magnetic moment, it is not profitable for two reasons to try in detail to treat the spin of a particle as due to this kind of revolutional motion. In the first place, if the phenomenon could be handled





by the methods of § 73 as a simple example of axial revolution, we should expect to find all integral multiples of $h/2\pi$ as the allowed eigenvalues of intrinsic angular momentum around any selected axis, while the actual analysis of spectral energy levels has shown that $+h/4\pi$ and $-h/4\pi$ are the only eigenvalues which really do occur. In the second place, if the spin could be regarded as a revolution, we should have to treat not only the components of intrinsic angular momentum but also the conjugate angular variables for the axial orientation of the electron as observable quantities. It is immediately evident, however, that the orientation of the electron around its own axes would presumably defy observation; and observable quantities corresponding to such orientation do not appear either in the original non-relativistic Pauli† theory of the spin which we shall consider, or in the later more fundamental relativistic treatment of the electron given by Dirac.‡
We are now ready to undertake the theoretical treatment of this new phenomenon. Corresponding to the fact that the only eigenvalues for the components of intrinsic angular momentum have half the magnitude of the Bohr unit of angular momentum $h/2\pi$, it is customary to introduce three new observables, the so-called spin variables $s_x$, $s_y$, and $s_z$, which have as their only eigenvalues $\pm\tfrac12$. If one of these observables exhibits a definite eigenvalue, say, for example, $s_x = +\tfrac12$, this then means that the particle is in a state such that the component of intrinsic angular momentum in the positive x-direction would exhibit the precise value $\tfrac12(h/2\pi)$. Since this amount of angular momentum would go to zero as h goes to zero, it has no classical analogue, and we regard the new observables $s_x$, $s_y$, and $s_z$ as non-classical.‖

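In addition to these new observables we may also introduce three new operators, $\mathbf{s}_x$, $\mathbf{s}_y$, and $\mathbf{s}_z$, which are regarded as correlated with the corresponding observables $s_x$, $s_y$, and $s_z$ in the usual quantum mechanical manner; and we must now consider the properties of these operators. For the commutation rules applying to the new operators we take††

$$\mathbf{s}_y\mathbf{s}_z-\mathbf{s}_z\mathbf{s}_y = i\mathbf{s}_x, \qquad \mathbf{s}_z\mathbf{s}_x-\mathbf{s}_x\mathbf{s}_z = i\mathbf{s}_y, \qquad \mathbf{s}_x\mathbf{s}_y-\mathbf{s}_y\mathbf{s}_x = i\mathbf{s}_z, \tag{75.1}$$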

† Pauli, Zeits. f. Phys. 43, 601 (1927).

‡ Dirac, The Principles of Quantum Mechanics, second edition, Oxford, 1935, chapter xii.

‖ As emphasized by Bohr, Atomtheorie und Naturbeschreibung, Berlin, 1931, observations of sufficient accuracy to give a direct separation between the intrinsic angular momentum of an electron and that due to its orbital motion cannot be made.

†† See foregoing reference to Pauli.




in analogy with the commutation rules (73.7) applying to the components of ordinary angular momentum. And, in accordance with the fact that the squares of the new observables have $+\tfrac14$ as their only eigenvalue, we also take

$$\mathbf{s}_x^2 = \mathbf{s}_y^2 = \mathbf{s}_z^2 = \tfrac14. \tag{75.2}$$

This agrees with the consideration that the characteristic equation $\mathbf{s}_x^2\psi = s_x^2\psi$ applying, for example, to the eigenvalues and eigenfunctions of the observable $s_x^2$ should necessarily reduce to the form $\mathbf{s}_x^2\psi = \tfrac14\psi$.

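The commutation rules given by (75.1) can be expressed in another form with the help of some straightforward algebra.† We first obtain the preliminary result

$$\begin{aligned} i(\mathbf{s}_x\mathbf{s}_y+\mathbf{s}_y\mathbf{s}_x) &= (i\mathbf{s}_x)\mathbf{s}_y+\mathbf{s}_y(i\mathbf{s}_x) \\ &= (\mathbf{s}_y\mathbf{s}_z-\mathbf{s}_z\mathbf{s}_y)\mathbf{s}_y+\mathbf{s}_y(\mathbf{s}_y\mathbf{s}_z-\mathbf{s}_z\mathbf{s}_y) \\ &= -\mathbf{s}_z\mathbf{s}_y^2+\mathbf{s}_y^2\mathbf{s}_z \\ &= -\tfrac14\mathbf{s}_z+\tfrac14\mathbf{s}_z \\ &= 0, \end{aligned}$$

where the second form of writing comes from the first of equations (75.1), and the next to the last form from (75.2). This result, which can also be expressed in the form

$$\mathbf{s}_x\mathbf{s}_y = -\mathbf{s}_y\mathbf{s}_x, \tag{75.3}$$

can be described by saying that the two operators anticommute. Substituting (75.3), together with the two further relations given by symmetry considerations, into (75.1), we then obtain

$$\mathbf{s}_y\mathbf{s}_z = -\mathbf{s}_z\mathbf{s}_y = \tfrac12 i\mathbf{s}_x, \qquad \mathbf{s}_z\mathbf{s}_x = -\mathbf{s}_x\mathbf{s}_z = \tfrac12 i\mathbf{s}_y, \qquad \mathbf{s}_x\mathbf{s}_y = -\mathbf{s}_y\mathbf{s}_x = \tfrac12 i\mathbf{s}_z, \tag{75.4}$$

as the desired alternative expression for the commutation relations (75.1).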

† See Dirac, The Principles of Quantum Mechanics, second edition, Oxford, 1935, § 19.

Since the three new operators which we have introduced do not commute with each other, we can determine only one of the observables $s_x$, $s_y$, and $s_z$ at any given time. On the other hand, since the new operators are assumed to commute with those for position, we can simultaneously determine the values of x, y, z and any one of the new observables that we desire. By convention this latter is usually chosen as $s_z$ for the component of spin parallel to the z-axis. Making this choice, we may then introduce, in the spirit of our previous development of the quantum mechanics, probability amplitudes, which can be written in the forms


$$\psi(x, y, z, s_z; t_0) = u(x, y, z, s_z) = u(q, s_z), \tag{75.5}$$

for specifying, at any selected time $t_0$, the state of our system in the x, y, z, $s_z$ language. These amplitudes will be functions of the continuous variables of position x, y, z and of the two discrete values of $s_z = \pm\tfrac12$ which are possible. For the actual probability of finding the particle in a specified range $dx\,dy\,dz$ with a specified value of $s_z$, we shall have

$$W(x, y, z, s_z)\,dx\,dy\,dz = u^*(q, s_z)\,u(q, s_z)\,dx\,dy\,dz. \tag{75.6}$$

And for the total probability of finding the particle somewhere and with one or the other of the two possible values of the spin variable, we shall have the summation of two terms:

$$\sum_{s_z=\pm\frac12}\iiint u^*(q, s_z)\,u(q, s_z)\,dx\,dy\,dz = 1. \tag{75.7}$$

We must next inquire into the action of our operators $\mathbf{s}_x$, $\mathbf{s}_y$, $\mathbf{s}_z$ on probability amplitudes of the above kind. In accordance with the spirit of our previous development of the quantum mechanics (see § 67 (b)), we shall assume that each operator $\mathbf{s}_k$ can be correlated in any selected language with an appropriate Hermitian matrix, and the action of the operator then represented by an expression of the general form

$$\mathbf{s}_k u(q, m) = \sum_n [s_k]_{mn}\,u(q, n) \qquad (m, n = \pm\tfrac12), \tag{75.8}$$

where $[s_k]_{mn}$ is an element of the Hermitian matrix of two rows and two columns which corresponds in the language employed to the operator $\mathbf{s}_k$ (k = x, y, z).

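We may now investigate the specific form of these matrices in the x, y, z, $s_z$ language. The matrix correlated with $\mathbf{s}_z$ will be specially simple in this language, since the action of the operator should then reduce merely to multiplication by $s_z$ itself. This leads at once to the form

$$[s_z]_{mn} = \begin{pmatrix} \tfrac12 & 0 \\ 0 & -\tfrac12 \end{pmatrix}, \tag{75.9}$$

which does give us, on substitution into (75.8), the desired results

$$\mathbf{s}_z u(q, \tfrac12) = \tfrac12 u(q, \tfrac12), \qquad \mathbf{s}_z u(q, -\tfrac12) = -\tfrac12 u(q, -\tfrac12).$$

Turning next to the matrix corresponding to the operator $\mathbf{s}_x$, we can apply the general principle (see (67.20)) that matrices will satisfy the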




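same algebraic relations as the operators to which they correspond provided matrix multiplication is used in place of operator multiplication. The second of equations (75.4) can then be written in the form

$$\sum_n [s_z]_{mn}[s_x]_{nl} = -\sum_n [s_x]_{mn}[s_z]_{nl},$$

which, by considering the cases $m = l = \pm\tfrac12$ and making use of (75.9), gives us

$$[s_x]_{\frac12\,\frac12} = [s_x]_{-\frac12\,-\frac12} = 0.$$

Furthermore, writing the first of equations (75.2) in the form

$$\sum_n [s_x]_{mn}[s_x]_{nl} = \tfrac14\delta_{ml},$$

we then obtain

$$[s_x]_{\frac12\,-\frac12}\,[s_x]_{-\frac12\,\frac12} = \tfrac14,$$

which, in accordance with the Hermitian character of the matrix, can only be satisfied by

$$[s_x]_{\frac12\,-\frac12} = \tfrac12 e^{i\gamma}, \qquad [s_x]_{-\frac12\,\frac12} = \tfrac12 e^{-i\gamma},$$

where $\gamma$ is an arbitrary phase. For example, the arbitrary phase can be taken as zero, which now gives us

$$[s_x]_{mn} = \begin{pmatrix} 0 & \tfrac12 \\ \tfrac12 & 0 \end{pmatrix} \tag{75.10}$$

as the desired expression for the matrix in question.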

Turning finally to the matrix corresponding to the operator $\mathbf{s}_y$, we can re-express the second of equations (75.4) in the form

$$[s_y]_{ml} = -2i\sum_n [s_z]_{mn}[s_x]_{nl},$$

which, with the help of (75.9) and (75.10), leads at once to

$$[s_y]_{mn} = \begin{pmatrix} 0 & -\tfrac12 i \\ \tfrac12 i & 0 \end{pmatrix} \tag{75.11}$$

as the expression for the remaining matrix. This then completes the apparatus necessary for the non-relativistic treatment of problems involving spin.
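
The three two-by-two matrices can be checked directly against the operator relations they are meant to satisfy. The following minimal sketch uses plain nested lists with Python complex numbers; the helper names are our own, and rows and columns are ordered with $s_z = +\tfrac12$ first. It verifies the commutation rules (75.1), the squares (75.2), and the products (75.4).

```python
# Spin matrices in the s_z language; index 0 is s_z = +1/2, index 1 is s_z = -1/2.
SX = [[0, 0.5], [0.5, 0]]
SY = [[0, -0.5j], [0.5j, 0]]
SZ = [[0.5, 0], [0, -0.5]]
IDENTITY = [[1, 0], [0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(2)] for i in range(2)]

def scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

def commutator(a, b):
    return sub(matmul(a, b), matmul(b, a))

checks = {
    "(75.1) sy sz - sz sy = i sx": close(commutator(SY, SZ), scale(1j, SX)),
    "(75.1) sz sx - sx sz = i sy": close(commutator(SZ, SX), scale(1j, SY)),
    "(75.1) sx sy - sy sx = i sz": close(commutator(SX, SY), scale(1j, SZ)),
    "(75.2) squares equal 1/4": all(
        close(matmul(s, s), scale(0.25, IDENTITY)) for s in (SX, SY, SZ)),
    "(75.4) sz sx = (i/2) sy": close(matmul(SZ, SX), scale(0.5j, SY)),
    "(75.3) sx sy = - sy sx": close(matmul(SX, SY), scale(-1, matmul(SY, SX))),
}
for label, ok in checks.items():
    print(label, ok)
```

Every entry should print `True`. The particular matrices correspond to the choice of zero for the arbitrary phase, so a different phase convention would change $[s_x]$ and $[s_y]$ together without affecting the relations tested here.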

(b) Applications involving spin. The most familiar necessity for application of the foregoing theoretical apparatus for spin arises in the study of electronic energy levels and their correlation with the data of atomic spectra. For this purpose several considerations must be taken into account. Firstly, in treating the total angular momentum, and its conservation in fields of suitable symmetry, both spin and orbital momentum must be included. Secondly, in treating energy levels involving more than a single electron, only those solutions which are antisymmetric both in coordinate and spin variables are to be considered, as will be discussed more completely in § 76. And finally, the Hamiltonian for an electron must be modified by the inclusion of terms which correspond classically to the interaction of its magnetic moment with any imposed external magnetic field, and with the internal electromagnetic field which it encounters in its orbital motion.

Making due use of these several considerations, it then becomes possible to give a good account of electronic energy levels. For example, by adding to the Hamiltonian for the electron a term, corresponding to its magnetic energy in an external magnetic field $H_z$, of the

form

where $\mu$ is the appropriate constant corresponding to the electron's intrinsic magnetic moment, it is possible to give a treatment of the Zeeman effect for the hydrogen atom. The suggested considerations also find their application in studying the energy levels of molecules as well as atoms. Nevertheless, any detailed treatment of the effect of spin phenomena in the fields of atomic and molecular spectra would have to be too elaborate for inclusion here.

For the considerations of statistical mechanics, the most important consequence of the existence of particle spin lies in the simple fact that we must now expect twice as many solutions of Schroedinger's equation for a particle having this property as for a simple particle without it. If, for example, we have a steady state solution for a particle of the form

$$\psi(x, y, z, \tfrac12; t) = u(x, y, z, \tfrac12)\,e^{-(2\pi i/h)Et},$$

where E is the energy of the particle and the spin variable has the specified value $s_z = +\tfrac12$, we must also expect a related steady state solution of similar form,

$$\psi'(x, y, z, -\tfrac12; t) = u'(x, y, z, -\tfrac12)\,e^{-(2\pi i/h)E't},$$

where the spin variable has its other possible eigenvalue $s_z = -\tfrac12$. And, indeed, in most cases of interest we may even expect u and u' to have the same dependence on the coordinates x, y, and z and E and E' to be equal, since the interaction of the intrinsic magnetic moment of the particle with the surrounding electromagnetic field will either be absent, negligible, or at least on the average independent of the value of $s_z$.

Such a simple doubling in the number of eigensolutions is of frequent importance for statistical mechanics and is easily taken into account. We have already called attention in § 71 (b) to the doubling that thus arises in the number of eigensolutions for a particle in a container.





76. Systems of two or more like particles 

In § 74 we have already given a treatment to systems consisting of two dissimilar particles of masses $m_1$ and $m_2$, and must now turn in the present section to some important new considerations that must be kept in mind in the quantum mechanical treatment of systems containing two or more entirely similar particles. These new considerations may be regarded as finding their philosophical justification in a combination of the operational point of view, which would confine the formalism of physics to the treatment of conceivably possible observations, with the quantum mechanical conclusion as to theoretical limitations on observability, which may preclude the possibility of any knowledge as to the separate individual behaviour of two entirely similar particles.

As an illustration, it is evident in the case of collisions between entirely similar particles that the Heisenberg uncertainty principle would make it theoretically impossible to follow the separate motions of the two particles throughout the course of a close encounter. Hence our formalism for treating such collisions should be devised so as to introduce no possibility of distinguishing as to which particle is which after such an encounter has taken place. And it has indeed been found, for example in the case of collisions of alpha particles with alpha particles, that correct conclusions can only be drawn when the formalism does allow for this consideration.†

As another illustration, it is evident, in treating the steady states of a system such as a box containing two identical particles, that the system could be maintained in its assumed steady state only if we refrain from interfering with the constant value of its energy by making observations on the whereabouts of the two particles. Hence the formalism for treating such steady states should be devised so as to be entirely symmetrical in the predictions which it makes as to the behaviour of the two particles. And here also it has been found, for example in studying the steady states of a helium atom with its two entirely similar electrons, that correct conclusions can only be drawn from a formalism which does provide symmetrical predictions for the two particles.‡

We may now turn to a more detailed examination of the nature of the new considerations, beginning with the case of only two similar particles.

† See Oppenheimer, Phys. Rev. 32, 361 (1928); and Mott, Proc. Roy. Soc. A, 125, 222 (1929); ibid. 126, 259 (1930).

‡ See Heisenberg, Zeits. f. Phys. 39, 499 (1926).





(a) Symmetric and antisymmetric solutions for pairs of like particles. To study the case of two similar particles let us start with Schroedinger's equation for the system

$$\mathbf{H}\psi+\frac{h}{2\pi i}\frac{\partial\psi}{\partial t} = 0 \tag{76.1}$$

and take

$$\psi = \psi(q_1, q_2, t) \tag{76.2}$$

as some actual solution of this equation, where $q_1$ represents the observables (say coordinates and spin) describing the first of the two particles, and $q_2$ represents the observables describing the second of the two particles.

Such an expression as (76.2) would not in general have to exhibit any 
symmetry in the observables $q_1$ and $q_2$ for the two particles, merely in 
order to be a correct solution of equation (76.1). For illustration, let 
us consider a system constructed by introducing two simple similar 
particles into a box, representing for specificity by $q_1$ the coordinates 
for the first particle put into the box and by $q_2$ those for the second 
particle so introduced. As a conceivable steady state solution of 
Schroedinger’s equation for this system we could then evidently write 

$$\psi(q_1 q_2, t) = u_k(q_1)\,u_l(q_2)\,e^{-(2\pi i/h)(E_k+E_l)t} \tag{76.3}$$

where for simplicity we neglect the energy of interaction between the 
two particles and take $u_k$ and $u_l$ as eigensolutions for a single particle 
in the box corresponding to the respective energies $E_k$ and $E_l$. Such 
an expression would be a formally correct solution of Schroedinger’s 
equation for the system. It would, nevertheless, be quite unsymmetrical 
with respect to the coordinates $q_1$ and $q_2$ for the two particles, and 
would permit the unsymmetrical assertion that the first particle has 
the energy $E_k$ and the second the energy $E_l$. Such an assertion, 
moreover, would be quite impossible of observational verification since all 
we could possibly hope to show would be that one of the two particles 
that we had put into the box (either the first or the second) had the 
energy $E_k$ and the other the energy $E_l$. 

As a consequence of considerations such as the above, we now see 
that a formally correct solution of Schroedinger’s equation of the form 
(76.2) would not necessarily be an allowable solution, from a physical 
point of view, if the situation is such that we must demand symmetrical 
predictions for the two particles. Nevertheless, starting with any 
correct solution of Schroedinger’s equation for two similar particles, we 
shall find it easily possible to construct allowable solutions which do 
have the necessary symmetry properties. 

Owing to the fact that the two particles under consideration have by 
hypothesis entirely similar properties, it is evident that the Hamiltonian 
operator H for the system will depend in an entirely symmetrical 
manner on the observables $q_1$ and $q_2$ for the two particles. Hence, if 
$\psi(q_1 q_2, t)$ is a correct solution of Schroedinger’s equation, it is evident 
that $\psi(q_2 q_1, t)$, which differs therefrom only by an interchange in the 
occurrence of the observables for the two particles within the form of 
expression, would also be a correct solution. Furthermore, since both 
of the operators H and $(h/2\pi i)\,\partial/\partial t$ occurring in the Schroedinger 
equation are linear, it is evident that any linear combination of 
these two solutions would also be a correct solution of Schroedinger’s 
equation. 

Two such linear combinations are of special interest. The first of 
these is the so-called symmetric solution 

$$\psi_s(q_1 q_2, t) = \frac{1}{\sqrt{2}}\{\psi(q_1 q_2, t) + \psi(q_2 q_1, t)\} \tag{76.4}$$

where $1/\sqrt{2}$ proves to be the correct normalizing factor if the original 
solution $\psi(q_1 q_2, t)$ is normalized to unity. This solution is 
characterized by complete invariance to an interchange in the indices for the 
two particles, i.e. 

$$\psi_s(q_1 q_2, t) = \psi_s(q_2 q_1, t) \tag{76.5}$$

The second linear combination of interest is the so-called antisymmetric 
solution, 

$$\psi_a(q_1 q_2, t) = \frac{1}{\sqrt{2}}\{\psi(q_1 q_2, t) - \psi(q_2 q_1, t)\} \tag{76.6}$$

It is characterized by a change in sign which occurs when the indices 
for the two particles are interchanged, i.e. 

$$\psi_a(q_1 q_2, t) = -\psi_a(q_2 q_1, t) \tag{76.7}$$

As we shall immediately show, either of these solutions has the desired 
property of giving symmetrical predictions as to the two particles. 
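
The construction of (76.4) and (76.6) may be illustrated numerically. The following is only a sketch under stated assumptions: the box is taken as one-dimensional and of unit length, the single-particle solutions are the familiar sine functions, the time factor is dropped, and the names `BOX`, `u`, and `norm` are introduced here purely for illustration. In this example the factor $1/\sqrt{2}$ normalizes both combinations because the two sine functions chosen are orthogonal.

```python
import numpy as np

BOX = 1.0                          # box length, arbitrary units (assumption)
x = np.linspace(0.0, BOX, 201)     # common grid for q1 and q2
dx = x[1] - x[0]

def u(n, q):
    """n-th eigensolution for one particle in a one-dimensional box."""
    return np.sqrt(2.0 / BOX) * np.sin(n * np.pi * q / BOX)

q1, q2 = np.meshgrid(x, x, indexing="ij")
psi = u(1, q1) * u(2, q2)          # spatial part of (76.3) with k = 1, l = 2
psi_swapped = psi.T                # same expression with q1 and q2 interchanged

psi_s = (psi + psi_swapped) / np.sqrt(2.0)   # symmetric combination (76.4)
psi_a = (psi - psi_swapped) / np.sqrt(2.0)   # antisymmetric combination (76.6)

def norm(f):
    """Riemann-sum estimate of the integral of |f|^2 over q1 and q2."""
    return float(np.sum(np.abs(f) ** 2) * dx * dx)

# Symmetry checks corresponding to (76.5) and (76.7), then the three norms.
print(np.allclose(psi_s, psi_s.T), np.allclose(psi_a, -psi_a.T))
print(norm(psi), norm(psi_s), norm(psi_a))
```

The original product `psi` is not symmetrical in the two particles, whereas `psi_s` is unchanged and `psi_a` reverses its sign when the two grid indices are interchanged.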

(b) Properties of the symmetric and antisymmetric solutions. We 
must now consider the properties of our symmetric and antisymmetric 
solutions in some detail. First of all we must show that they do give 
symmetrical predictions as to the two particles. 

We may begin by inquiring into the direct predictions which the two 
solutions would provide as to the probability of finding different 
possible values for the observables $q_1$ and $q_2$ for the two particles. Assuming 
appropriate normalization, these predictions could be directly obtained 
in the two cases with the help of the probability densities 

$$W_s(q_1 q_2, t) = \psi_s^*(q_1 q_2, t)\,\psi_s(q_1 q_2, t)$$

and 

$$W_a(q_1 q_2, t) = \psi_a^*(q_1 q_2, t)\,\psi_a(q_1 q_2, t). \tag{76.8}$$

We immediately see, however, since $\psi_s(q_1 q_2, t)$ is completely invariant 
to an interchange of indices for the two particles, and since $\psi_a(q_1 q_2, t)$ 
merely changes its sign on such an interchange, that the two probability 
densities are themselves invariant to any interchange in particle indices. 
Hence they would both provide predictions which would have the 
desired character of being the same for one of the two particles as for 
the other. 
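
The invariance just asserted may be written out in one step. The lines below merely combine the symmetry properties (76.5) and (76.7) with the definitions of the probability densities, and introduce nothing beyond them.

```latex
\begin{aligned}
W_s(q_2 q_1, t) &= \psi_s^*(q_2 q_1, t)\,\psi_s(q_2 q_1, t)
                 = \psi_s^*(q_1 q_2, t)\,\psi_s(q_1 q_2, t) = W_s(q_1 q_2, t), \\
W_a(q_2 q_1, t) &= \{-\psi_a^*(q_1 q_2, t)\}\,\{-\psi_a(q_1 q_2, t)\}
                 = \psi_a^*(q_1 q_2, t)\,\psi_a(q_1 q_2, t) = W_a(q_1 q_2, t).
\end{aligned}
```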

For example, if we consider for simplicity a particle without spin 
and let $dq'$ and $dq''$ denote two infinitesimal ranges in which the 
coordinates $q_1$ and $q_2$ for the particles might fall, we should compute 
equal probabilities of the general form 

$$W(q' q'', t)\,dq'\,dq'' = W(q'' q', t)\,dq''\,dq' \tag{76.9}$$

for finding the first in $dq'$ and the second in $dq''$ and for finding the 
first in $dq''$ and the second in $dq'$, since the probability density 
$W(q_1 q_2, t)$ is itself invariant to an interchange of indices, both for the 
case of the symmetric and of the antisymmetric solution. We thus see, 
since our particles are not distinguishable on the basis of any intrinsic 
properties, that our formalism is really only adapted for calculating the 
total probability 

$$W(q' q'', t)\,dq'\,dq'' + W(q'' q', t)\,dq''\,dq' \tag{76.10}$$

of finding one particle in the range $dq'$ and the other in the range $dq''$. 
Analogous calculations also show that the probability currents are 
symmetrical in the two particles. 

We thus see for the case of two completely similar particles that the 
symmetric and the antisymmetric solutions both have the property of 
giving predictions which are entirely symmetrical with respect to the 
two particles and provide no means of distinguishing one of the particles 
from the other. We also see for the case of only two particles that these 
are the only forms of solution which would have this property. 

We must next emphasize the further important property of these 
solutions in that they always maintain their original particular 
symmetry character as time proceeds. This is an evident consequence 
of their construction in the form $\psi(q_1 q_2, t) \pm \psi(q_2 q_1, t)$ from individual 
solutions each of which has the time dependence demanded by the 
Schroedinger equation. The necessity for the property can also be seen 
if we think of the application of Schroedinger’s equation 

$$\frac{\partial\psi}{\partial t} = -\frac{2\pi i}{h}H\psi \tag{76.11}$$

to a direct calculation of the change with time in the total expression 
for $\psi$. Since the operator H is itself symmetrical in $q_1$ and $q_2$, it is 
evident that the change in $\psi$ with time will have to have the same 
symmetry properties as $\psi$ itself. As a consequence of these 
considerations we now see that a pair of particles originally in a symmetric state 
would continue permanently in states having the symmetric property, 
and a pair originally in an antisymmetric state would always maintain 
the antisymmetric property. 
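
The permanence of symmetry character can be checked on a small discrete model. What follows is a sketch under assumptions not made in the text: each particle is confined to `M` grid points, the one-particle operator `h1` and the interaction `V` are arbitrary illustrative choices (the only essential feature being that V is symmetrical in the two particles), and units are chosen so that $h/2\pi = 1$.

```python
import numpy as np

M = 6                                            # grid points per particle
h1 = 2.0 * np.eye(M) - np.eye(M, k=1) - np.eye(M, k=-1)    # one-particle operator
ident = np.eye(M)
sites = np.arange(M)
V = 0.5 / (1.0 + np.abs(sites[:, None] - sites[None, :]))  # V(q1, q2) = V(q2, q1)

# Two-particle Hamiltonian; the state vector has components psi[i * M + j].
H = np.kron(h1, ident) + np.kron(ident, h1) + np.diag(V.ravel())

# Operator interchanging the two particle indices.
P = np.zeros((M * M, M * M))
for i in range(M):
    for j in range(M):
        P[i * M + j, j * M + i] = 1.0

def evolve(psi0, t):
    """Solve (76.11) with h / (2 pi) set equal to 1 (assumed units)."""
    energies, modes = np.linalg.eigh(H)
    return modes @ (np.exp(-1j * energies * t) * (modes.T @ psi0))

rng = np.random.default_rng(0)
phi = rng.normal(size=M * M)        # an arbitrary unsymmetrical starting vector
psi_s0 = phi + P @ phi
psi_a0 = phi - P @ phi
psi_s0 /= np.linalg.norm(psi_s0)
psi_a0 /= np.linalg.norm(psi_a0)

psi_s_t = evolve(psi_s0, 3.7)
psi_a_t = evolve(psi_a0, 3.7)
print(np.allclose(P @ H, H @ P))    # H is symmetrical in the two particles
print(np.allclose(P @ psi_s_t, psi_s_t), np.allclose(P @ psi_a_t, -psi_a_t))
```

Because the interchange operator commutes with H, a state that starts symmetric remains symmetric and one that starts antisymmetric remains antisymmetric, whatever the elapsed time chosen.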

As to the symmetry character of the different kinds of particles that 
actually occur in nature we must of course rely on observation. It is 
found in the first place that all the fundamental material particles 
(electrons, protons, and neutrons) occur only in antisymmetric states; and 
this may be a consequence of some underlying necessity of particle 
structure that we do not now understand. On the other hand, the 
non-material particles (photons) which can be introduced for the 
treatment of radiation are found to occur only in symmetric states. 
Furthermore, the complex material nuclei and the atoms as a whole, when 
treated as simple particles, are found to occur only in symmetric or 
antisymmetric states according as they are constructed respectively 
from an even or an odd number of fundamental particles, nuclei being 
regarded as built only from protons and neutrons. Thus, for example, 
alpha particles and nitrogen nuclei when treated as individual particles 
exhibit symmetric properties even though the fundamental material 
particles are all antisymmetric. 
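
The rule for composite particles amounts to counting fundamental constituents. A minimal sketch follows; the function names are hypothetical and the proton, neutron, and electron numbers are supplied by the user.

```python
def symmetry_class(n_fundamental):
    """Class of a composite built from n_fundamental antisymmetric particles."""
    return "symmetric" if n_fundamental % 2 == 0 else "antisymmetric"

def nucleus_class(protons, neutrons):
    """Nuclei are regarded as built only from protons and neutrons."""
    return symmetry_class(protons + neutrons)

def atom_class(protons, neutrons, electrons):
    """For the atom as a whole the electrons are counted as well."""
    return symmetry_class(protons + neutrons + electrons)

print(nucleus_class(2, 2))    # alpha particle: symmetric
print(nucleus_class(7, 7))    # nitrogen nucleus of mass number 14: symmetric
print(atom_class(2, 2, 2))    # neutral helium atom: symmetric
print(nucleus_class(1, 0))    # single proton: antisymmetric
```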

To complete our general discussion of symmetric and antisymmetric 
solutions we must also investigate the predictions which they allow in 
a situation where the separate behaviour of the two particles is at least 
approximately known, so that the two particles can be distinguished 
one from the other over a certain length of time. In such a situation 
we must, of course, still use either a symmetric or an antisymmetric 
solution, as the case may be, since the appropriate form of solution is 
determined for all time by the nature of the particles. Nevertheless, 
our predictions for the two particles must not now contradict the possi- 
bility that one particle could be known to be carrying out a behaviour 
quite different from that of the other. 



To investigate such a situation let us consider two solutions of 
Schroedinger’s equation $\psi_k(q, t)$ and $\psi_l(q, t)$ for a single particle in the 
potential field of interest, and let there be practically no overlapping 
of the solutions during the time of interest, $\psi_k$ being nearly zero for 
those values of q where $\psi_l$ is large and vice versa. As possible symmetric 
and antisymmetric solutions for two particles we could then take 

$$\psi_s(q_1 q_2, t) = \frac{1}{\sqrt{2}}\{\psi_k(q_1, t)\,\psi_l(q_2, t) + \psi_k(q_2, t)\,\psi_l(q_1, t)\} \tag{76.12}$$

and 

$$\psi_a(q_1 q_2, t) = \frac{1}{\sqrt{2}}\{\psi_k(q_1, t)\,\psi_l(q_2, t) - \psi_k(q_2, t)\,\psi_l(q_1, t)\} \tag{76.13}$$

where, on account of the failure of the solutions to overlap, we shall have 

$$\psi_k^*(q, t)\,\psi_l(q, t) \approx \psi_l^*(q, t)\,\psi_k(q, t) \approx 0 \tag{76.14}$$

when the index for either particle 1 or 2 is attached to the observables q. 

Noting (76.14), we then see that the probability density would 
assume the same form both for the symmetric and the antisymmetric 
case, namely, 

$$W_s(q_1 q_2, t) = W_a(q_1 q_2, t) = \tfrac{1}{2}\{\psi_k^*(q_1, t)\psi_k(q_1, t)\,\psi_l^*(q_2, t)\psi_l(q_2, t) + \psi_k^*(q_2, t)\psi_k(q_2, t)\,\psi_l^*(q_1, t)\psi_l(q_1, t)\} \tag{76.15}$$

This shows us that any differences in the laws of behaviour for 
particles having symmetric and antisymmetric properties do not 
become operative so long as the wave functions for the two particles do 
not overlap. 

Furthermore, if we now compute the probability of finding one 
particle in a range $dq^{(k)}$, where $\psi_k$ is appreciably different from zero, and 
a second particle in a range $dq^{(l)}$, where $\psi_l$ is appreciably different from 
zero, we shall obtain both in the symmetric and the antisymmetric 
case, in accordance with the expression for such probabilities given by 
(76.10), 

$$W(q^{(k)} q^{(l)}, t)\,dq^{(k)}dq^{(l)} + W(q^{(l)} q^{(k)}, t)\,dq^{(l)}dq^{(k)} = \tfrac{1}{2}|\psi_k(q^{(k)}, t)|^2|\psi_l(q^{(l)}, t)|^2\,dq^{(k)}dq^{(l)} + \tfrac{1}{2}|\psi_l(q^{(l)}, t)|^2|\psi_k(q^{(k)}, t)|^2\,dq^{(l)}dq^{(k)} \tag{76.16}$$

where two terms in the product disappear since $\psi_k$ is practically zero 
in the range $dq^{(l)}$ and $\psi_l$ in the range $dq^{(k)}$. Examining (76.16), however, 
we now see that, so long as the solutions do not overlap, we can rewrite 
our expression for the probability of finding one particle in $dq^{(k)}$ and 
the other in $dq^{(l)}$ in the form 

$$W(q^{(k)} q^{(l)}, t)\,dq^{(k)}dq^{(l)} = |\psi_k(q^{(k)}, t)|^2\,|\psi_l(q^{(l)}, t)|^2\,dq^{(k)}dq^{(l)}, \tag{76.17}$$

where the superscript (k) now denotes that particle whose behaviour 
follows the solution $\psi_k$ and (l) the particle following the solution $\psi_l$. 
This then satisfactorily demonstrates, in accordance with the spirit 
of the correspondence principle, that our formalism is such that two 
particles whose intrinsic properties are identical can nevertheless be 
distinguished from each other by their behaviour in the classical limit 
where quantum mechanical uncertainties connected with the 
overlapping of probability amplitudes can be neglected. Incidentally it also 
shows that our choice of $1/\sqrt{2}$ as the normalizing factor in (76.4) and 
(76.6) was the correct one to make. 
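
The disappearance of any difference between the two symmetry classes when the wave functions fail to overlap can be seen numerically. The sketch below assumes real Gaussian packets in one dimension at a single instant; the grid, the packet width, and the function names are illustrative choices only. A strongly overlapping pair is included merely for contrast and is not renormalized.

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 401)
dx = x[1] - x[0]

def packet(center, width=0.5):
    """Real Gaussian packet normalized so that sum |psi|^2 dx = 1."""
    g = np.exp(-((x - center) ** 2) / (4.0 * width ** 2))
    return g / np.sqrt(np.sum(g ** 2) * dx)

def densities(center_k, center_l):
    """Return W_s and W_a on the grid, built as in (76.12) and (76.13)."""
    psi_k, psi_l = packet(center_k), packet(center_l)
    direct = np.outer(psi_k, psi_l)      # psi_k(q1) psi_l(q2)
    exchanged = np.outer(psi_l, psi_k)   # psi_k(q2) psi_l(q1)
    w_s = (direct + exchanged) ** 2 / 2.0
    w_a = (direct - exchanged) ** 2 / 2.0
    return w_s, w_a

w_s_far, w_a_far = densities(-5.0, 5.0)      # negligible overlap
w_s_near, w_a_near = densities(-0.2, 0.2)    # strong overlap, for contrast

print(np.max(np.abs(w_s_far - w_a_far)))     # practically zero
print(np.max(np.abs(w_s_near - w_a_near)))   # clearly different from zero
print(np.sum(w_s_far) * dx * dx, np.sum(w_a_far) * dx * dx)
```

For the well separated packets the two densities agree to within rounding and each integrates to unity with the factor $1/\sqrt{2}$, in agreement with the remark on normalization above.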

(c) Treatment of more than two like particles. We must now make 
a brief statement concerning the treatment of more than two particles 
having identical intrinsic properties. Here again we shall wish to de- 
mand that the allowed solutions of the Schroedinger equation should be 
such as to lead to symmetrical predictions for the different particles 
involved in a situation where quantum mechanical limitations on ob- 
servability would prevent the maintenance of any knowledge as to 
which particle was which. It is evident that such a demand would 
continue to be satisfied with more than two particles either by a 
symmetric solution $\psi_s(q_1 q_2 q_3 \ldots q_n, t)$ for n particles, having the property 

$$P\,\psi_s(q_1 q_2 q_3 \ldots q_n, t) = \psi_s(q_1 q_2 q_3 \ldots q_n, t) \tag{76.18}$$

where the application of the operator P represents any permutation 
of particle indices, or by an antisymmetric solution $\psi_a(q_1 q_2 q_3 \ldots q_n, t)$ 
having the property 

$$P\,\psi_a(q_1 q_2 q_3 \ldots q_n, t) = \pm\,\psi_a(q_1 q_2 q_3 \ldots q_n, t) \tag{76.19}$$

where we have the negative or positive sign according as the 
permutation is odd or even. In either case the probability densities would 
evidently be invariant to any interchange in particle indices. 

As soon as we go to three or more particles, however, it also proves 
possible to find further types of solution, having more complicated 
symmetry characters than the above, which on appropriate combina- 
tion also lead to probability densities which are invariant to a per- 
mutation of particle indices. It seems difficult to eliminate these other 
possibilities on a purely theoretical basis. Nevertheless, if they existed 
we should sometimes find a pair of particles of a given kind which 
exhibited none but antisymmetric solutions, and at other times a pair 
of the same kind of particles which exhibited none but symmetric 
solutions. For example, a helium atom with the ordinary antisymmetric 
solution, corresponding to two electrons in the K-shell with their spins 
antiparallel, could appear, after collision with a third electron, with 
two electrons in the K-shell with parallel spins, and this new atom 
would then exhibit none but symmetric states. Hence solutions other 
than the symmetric and the antisymmetric could lead to the unexpected 
result that all pairs of particles would not have to have identical pro- 
perties even though constructed from single particles having identical 
properties. 

We must, of course, ultimately appeal to observation to determine 
the symmetry character of the solutions that do correspond to natural 
phenomena. So far no evidence has been found for solutions other than 
the completely symmetric or completely antisymmetric. With any 
number of similar particles in the system, the different kinds of particle 
actually occurring in nature fall either into the symmetric or 
antisymmetric class as already described above in the case of pairs of 
particles. 

If $\psi(q_1 q_2 q_3 \ldots q_n, t)$ is a formally correct solution of Schroedinger’s 
equation for a system of n particles, which has been normalized to 
unity, the corresponding allowable symmetric and antisymmetric 
solutions would be given by 

$$\psi_s(q_1 q_2 q_3 \ldots q_n, t) = \frac{1}{\sqrt{n!}}\sum_P P\,\psi(q_1 q_2 q_3 \ldots q_n, t)$$

and 

$$\psi_a(q_1 q_2 q_3 \ldots q_n, t) = \frac{1}{\sqrt{n!}}\sum_P (\pm)\,P\,\psi(q_1 q_2 q_3 \ldots q_n, t) \tag{76.20}$$

where $1/\sqrt{n!}$ is the proper normalizing factor, the operator P is regarded 
as permuting the different particle indices 1, ..., n, and we take a 
summation over all possible independent permutations including that of 
identity. In the symmetric case the n! terms are all taken with the 
positive sign, and in the antisymmetric case with negative or positive 
sign according as the permutation is odd or even. 
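
The sums over permutations in (76.20) are easily carried out by machine for small n. In the sketch below, which is an illustration and not part of the original treatment, a solution for n particles at one instant is represented by an array with one axis per particle, so that permuting particle indices becomes a transposition of axes; the names `permutation_sign` and `symmetrize` are hypothetical. The factor $1/\sqrt{n!}$ is applied as in (76.20); whether it yields exact normalization depends on the permuted terms being mutually orthogonal, which is not tested here.

```python
import itertools
import math
import numpy as np

def permutation_sign(perm):
    """Return +1 for an even permutation and -1 for an odd one."""
    n = len(perm)
    inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
    return -1 if inversions % 2 else 1

def symmetrize(psi, antisymmetric=False):
    """Sum over all permutations of the particle axes, as in (76.20)."""
    n = psi.ndim
    total = np.zeros_like(psi)
    for perm in itertools.permutations(range(n)):
        sign = permutation_sign(perm) if antisymmetric else 1
        total = total + sign * np.transpose(psi, perm)
    return total / math.sqrt(math.factorial(n))

rng = np.random.default_rng(1)
psi = rng.normal(size=(3, 3, 3))     # arbitrary unsymmetrical array, n = 3
psi_s = symmetrize(psi)
psi_a = symmetrize(psi, antisymmetric=True)

# Interchanging any pair of particle axes leaves psi_s alone and reverses psi_a.
print(np.allclose(psi_s, np.swapaxes(psi_s, 0, 2)))
print(np.allclose(psi_a, -np.swapaxes(psi_a, 1, 2)))
```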

(d) Further properties of symmetric and antisymmetric solutions — 
case of small interaction between particles. Pauli exclusion principle. 
In case we have a system of n similar particles at such a dilution that 
the energy of interaction between the particles is small, it is often 
advantageous to treat the n particles as being distributed among the 
various unperturbed energy eigensolutions $u_k(q)$, $u_l(q)$, $u_m(q)$, etc., which 
would be found for a single particle, using an unperturbed Hamiltonian 
for the system that neglects interaction. At any instant we could 
then regard the state of the system as describable with the help of a 
superposition of products of the general form $u_k(q_1)u_l(q_2)u_m(q_3) \ldots u_r(q_n)$. 



In case the particles belong to the symmetric class, we may make our 
superpositions by first combining products such as the above to give 
symmetric eigenfunctions of the form 

$$\psi_s(q_1 q_2 q_3 \ldots q_n) = \sum_P P\,u_k(q_1)u_l(q_2)u_m(q_3) \ldots u_r(q_n), \tag{76.21}$$

where a summation is taken over all the n! permutations of the particle 
indices. Such eigenfunctions would evidently themselves be symmetric, 

$$P\,\psi_s(q_1 q_2 q_3 \ldots q_n) = \psi_s(q_1 q_2 q_3 \ldots q_n) \tag{76.22}$$

for any interchange of indices. Their linear superposition, when 
multiplied by suitable coefficients, which can incidentally take care 
of normalization, would then let us express any instantaneous state of 
the system, and by allowing these coefficients to change with the time, 
as in the method of the variation of constants (see § 68), we could 
express any solution of Schroedinger’s equation for our system of 
symmetric particles in its full time dependence. 

In case the particles belong to the antisymmetric class, we may 
first combine products such as $u_k(q_1)u_l(q_2)u_m(q_3) \ldots u_r(q_n)$ to give 
antisymmetric eigenfunctions of the form 

$$\psi_a(q_1 q_2 q_3 \ldots q_n) = \sum_P (\pm)\,P\,u_k(q_1)u_l(q_2)u_m(q_3) \ldots u_r(q_n), \tag{76.23}$$

where the negative or positive sign is to be used according as the 
permutation is odd or even. Such eigenfunctions would evidently 
themselves be antisymmetric, i.e. 

$$P\,\psi_a(q_1 q_2 q_3 \ldots q_n) = \pm\,\psi_a(q_1 q_2 q_3 \ldots q_n) \tag{76.24}$$

and they could be superposed in the manner described above to express 
any solution of Schroedinger’s equation for a system of antisymmetric 
particles. It is sometimes convenient to note that an antisymmetric 
eigenfunction of the form (76.23) could also be expressed as the 
determinant 

$$\psi_a(q_1 q_2 q_3 \ldots q_n) = \begin{vmatrix} u_k(q_1) & u_l(q_1) & u_m(q_1) & \cdots & u_r(q_1) \\ u_k(q_2) & u_l(q_2) & u_m(q_2) & \cdots & u_r(q_2) \\ u_k(q_3) & u_l(q_3) & u_m(q_3) & \cdots & u_r(q_3) \\ \vdots & \vdots & \vdots & & \vdots \\ u_k(q_n) & u_l(q_n) & u_m(q_n) & \cdots & u_r(q_n) \end{vmatrix} \tag{76.25}$$

This form of writing makes it very easy to see that these 
antisymmetric eigenfunctions would change sign for any single interchange 
of a pair of particle indices, since this would be equivalent to the 
interchange of two rows of the determinant. 



In setting up the products $u_k(q_1)u_l(q_2)u_m(q_3) \ldots u_r(q_n)$ and 
constructing a symmetric eigenfunction with the help of (76.21), it is 
evident that the same individual solution, for example $u_k(q) = u_l(q)$, 
might occur more than once so that a number of particles could be in 
the same ‘unperturbed quantum state’. In constructing an 
antisymmetric eigenfunction, however, with the help of (76.23) or (76.25), it 
is immediately evident that any eigenfunction $\psi_a(q_1 q_2 q_3 \ldots q_n)$ would 
automatically be equal to zero if the same individual solution, for example 
$u_k(q)$, should occur more than once in the original product, since this 
would make two columns identical in the determinant (76.25). Hence 
for the antisymmetric case no two particles could ever be found in the 
same ‘unperturbed quantum state’. 
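
The agreement between the signed sum (76.23) and the determinant (76.25), and the vanishing of the determinant when a state is repeated, can be verified for three particles. The sketch assumes spinless particles in a one-dimensional box of unit length with sine eigensolutions; the function names and the sample coordinates are illustrative, and no normalizing factor is included.

```python
import itertools
import numpy as np

def u(n, q, box=1.0):
    """n-th eigensolution for one particle in a one-dimensional box."""
    return np.sqrt(2.0 / box) * np.sin(n * np.pi * q / box)

def determinant_form(states, coords):
    """Determinant (76.25): rows follow particles, columns follow states."""
    matrix = np.array([[u(n, q) for n in states] for q in coords])
    return float(np.linalg.det(matrix))

def permutation_sum(states, coords):
    """Signed sum over permutations of the particle indices, as in (76.23)."""
    total = 0.0
    n = len(coords)
    for perm in itertools.permutations(range(n)):
        inversions = sum(perm[i] > perm[j] for i in range(n) for j in range(i + 1, n))
        sign = -1.0 if inversions % 2 else 1.0
        term = 1.0
        for state, particle in zip(states, perm):
            term *= u(state, coords[particle])
        total += sign * term
    return total

coords = [0.15, 0.40, 0.70]
value = determinant_form([1, 2, 3], coords)
swapped = determinant_form([1, 2, 3], [0.40, 0.15, 0.70])   # particles 1 and 2 interchanged
repeated = determinant_form([1, 1, 3], coords)              # same state used twice

print(value, permutation_sum([1, 2, 3], coords))   # the two forms agree
print(swapped)                                     # equal and opposite to value
print(repeated)                                    # zero to within rounding
```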

This result, which holds for all the fundamental material particles 
(electrons, protons, and neutrons), may be regarded as a generalization 
of the original Pauli exclusion principle describing the fact that we 
never find more than one electron in an atom with a given set of values 
for all four quantum numbers, including that for the spin, or, in other 
words, neglecting the direction of spin, never more than a pair of 
electrons in any single atomic orbit. The great importance of this 
principle in explaining the facts of spectroscopy and chemistry is well 
known. 

It will be seen not only that the antisymmetric property of solutions 
implies the exclusion principle, but, vice versa, that the exclusion 
principle can only be satisfied by antisymmetric solutions since only 
these have the property of vanishing with two particles in the same 
‘quantum state’. It is also interesting to note in the case of anti- 
symmetric solutions that two similar particles with the same direction 
of spin could never occupy the same position in space, since the 
probability amplitude $\psi_a(q_1 q_2, t)$ describing the situation would then have 
to be equal to both $\pm\psi_a(q_2 q_1, t)$, which can only be satisfied by the 
value zero. 

It may be remarked that the exclusion principle does not have to be 
interpreted as meaning that all the electrons in the world ‘must keep 
in a mysterious communication in order not to get into symmetric 
states’, since we have already seen in connexion with equation (76.15) 
that the differences between symmetric and antisymmetric solutions 
will automatically only come into play when the wave functions for 
the different electrons under consideration overlap. 

(e) Enumeration of eigensolutions. We may now conclude this long 
treatment of systems containing similar particles by emphasizing some 
considerations which will be of special importance in our later 
statistical applications. In making these applications to systems of similar 
particles we shall find it convenient to regard the state of the system 
at any time as expressed by a superposition of eigenfunctions of the 
kind discussed above, and by considering the behaviour of ensembles 
of such systems we shall be led to assign equal a priori probabilities 
to the different eigenstates that correspond to these eigenfunctions. 
Hence, in calculating the probability of finding the system in any given 
specified condition, we shall wish to count the number of eigenstates 
that correspond to the specified condition. In carrying out such enu- 
meration we encounter two important differences between quantum 
statistics and classical statistics. 

In the first place, since a system of similar particles can never change 
from states of one symmetry class to those of another, but depending 
on the kind of particle must remain permanently in states which are 
either symmetric or antisymmetric, it is evident that our enumeration 
of the number of eigenstates corresponding to a specified condition of 
the system must include only those states which have the appropriate 
symmetry. This quantum mechanical limitation to a particular class 
of accessible† states has no immediate analogy in the classical statistics. 

In the second place, in counting the number of eigenstates that 
correspond to a specified steady condition of the assemblage, the question 
arises as to whether a new state will be obtained by the interchange 
of similar particles between the individual eigenfunctions describing 
them; for example, by the change of $u_k(q_1)u_l(q_2)$ into $u_k(q_2)u_l(q_1)$. In 
accordance with all of the foregoing discussion, it is evident, however, 
that such an interchange does not lead to a new eigenstate. In the case 
of symmetric solutions, the interchange would not even have any 
effect on the eigenfunctions; and in the case of antisymmetric ones 
it would only change the sign of the eigenfunctions, which, as we have 
seen above, has no physical meaning or consequences. This also is 
different from the classical situation where the interchange of similar 
particles between different individual states was regarded as leading to 
a new state of the system as a whole. 

Some additional remarks concerning this latter difference between 
classical and quantum statistics will be clarifying. In the classical 
mechanics, the procedure of taking the interchange of similar particles 
between different individual states as leading to a new state of the 
system as a whole was justified, not because the particles were thought 
of as carrying labels which would distinguish between them, but because 
it would in principle be possible to think of an observer who could 
follow the motions of the individual particles, without disturbing them, 
and actually determine whether two similar particles had 
interchanged roles or not. In the quantum mechanics, however, the 
possibility of following the behaviour of the individual particles is limited 
by the Heisenberg uncertainty principle, and this can make it 
impossible to determine whether such an interchange has taken place. 
Moreover, such a limitation is regarded in the quantum mechanics, not as 
an accident due to an unsatisfactory choice of the tools of observation, 
but as a limitation in principle which can make the question of such 
interchange meaningless. As a result, in the quantum statistics we do 
not count as different two states which cannot at least by conceivable 
observations be distinguished one from the other. 

† The useful term ‘accessible’ was first employed by R. H. Fowler, Statistical 
Mechanics, Cambridge University Press, 1929. 

The foregoing remarks will also make it possible to differentiate 
between cases where it is necessary to keep the above considerations 
in mind in counting eigenstates and cases where the enumeration does 
not involve any of these special alterations in principle arising from the 
quantum mechanics. In a general way it may be said that the situations 
actually met with in counting eigenstates fall into three classes. 

In the first class we meet systems of interacting particles having 
symmetric properties. In determining the probability of an equilibrium 
condition of such a system we must then count none but symmetric 
states and realize that a mere interchange in particles does not lead to 
a new eigenstate. In the second class we have systems of similar 
particles having antisymmetric properties, and must include only anti- 
symmetric states without allowance for particle interchange in our 
enumerations. Finally, in the third class of situations, the significance 
of which is sometimes overlooked, we meet systems composed of ele- 
ments, having similar properties, which are nevertheless distinguishable 
in principle the one element from another. For example, in treating 
the specific heat of solids it is convenient to take the various possible 
modes of elastic vibration as the similar elements to be considered, and 
these would be distinguishable one from the other, even with the same 
intrinsic frequency of vibration, by spatial location or orientation. Such 
possibilities of distinguishing one element from another also arise in 
connexion with the modes of electromagnetic vibration in a hollow 
enclosure, in connexion with isotopic particles, and in connexion with 
intrinsically similar particles which are kept in separate containers and 
not allowed to interact. In such cases, even though the individual 
eigensolutions $u_k(q_1)$, etc., for the different elements may be the 
same in functional form, it is evident that we must still count it as 
leading to a new eigenstate when we interchange the elements among 
such individual solutions. 

As we shall later see, the above three classes of situation correspond 
respectively to the occurrence in the quantum mechanics of the so- 
called Bose-Einstein, Fermi-Dirac, and Maxwell-Boltzmann types of 
statistical situation. 
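
The three modes of enumeration may be compared by direct counting in a small case. The sketch below is an illustration only: it distributes a given number of elements among a given number of individual states, counting (i) symmetric states, where interchange gives nothing new and repeated occupation is allowed, (ii) antisymmetric states, where interchange gives nothing new and no individual state may be used twice, and (iii) distinguishable elements, where every assignment counts separately. The function name `count_states` is hypothetical.

```python
from itertools import combinations, combinations_with_replacement, product
from math import comb

def count_states(n_elements, n_levels):
    """Return the counts for the symmetric, antisymmetric, and distinguishable cases."""
    levels = range(n_levels)
    symmetric = sum(1 for _ in combinations_with_replacement(levels, n_elements))
    antisymmetric = sum(1 for _ in combinations(levels, n_elements))
    distinguishable = sum(1 for _ in product(levels, repeat=n_elements))
    return symmetric, antisymmetric, distinguishable

print(count_states(2, 3))   # (6, 3, 9)

# The enumerations agree with the usual closed expressions.
n, g = 3, 5
print(count_states(n, g) == (comb(n + g - 1, n), comb(g, n), g ** n))   # True
```

With two elements and three individual states there are thus six symmetric states, three antisymmetric states, and nine states when the elements are distinguishable in principle.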



IX 

STATISTICAL ENSEMBLES IN THE QUANTUM MECHANICS 

77. Introduction of statistical methods in the classical and in 
the quantum mechanics 

We are now ready to undertake our investigation of statistical 
quantum mechanics. We may begin by making some remarks on the 
reasons for introducing statistical methods into classical and into 
quantum mechanical considerations, and on a notation which we shall 
find useful in the quantum statistical mechanics. 

In the classical mechanics we can regard the state of a system at 
any time as specified by the values of its coordinates and momenta, 
q and p, and can regard the future behaviour of the system as uniquely 
determined by its state at some initial time. Nevertheless, in spite of 
this theoretical possibility of exact predictions as to behaviour, we often 
encounter situations in the classical mechanics such that a precise 
treatment of behaviour with time would not be appropriate. Such 
situations arise, even in the treatment of simple mechanical systems, 
when our knowledge of the initial state is too inexact to justify its 
idealization as some particular precise state, and arise quite commonly 
in the treatment of complicated mechanical systems, both because our 
knowledge of initial state is then made necessarily inexact by empirical 
difficulties, and because the calculation of precise behaviour is made 
impracticable by computational difficulties. 

Under such conditions we may then find it useful not to treat the 
behaviour of any single system, but to consider instead the properties 
of a collection or ensemble of systems, each of the same character as 
the one of actual interest, but distributed in the coordinate momentum 
phase space with a continuous range of values for their coordinates 
and momenta, q and p. From a study of the average behaviour of 
the systems in a properly chosen ensemble we may then feel able to 
draw conclusions as to the average or expected behaviour of a single 
system of interest. 

Quite similar situations are also encountered in the quantum 
mechanics. To be sure, the state of the system must then be regarded 
as specified at any time by a probability amplitude, such as $\psi(q, t)$, 
$\phi(p, t)$, or $a(k, t)$, which will in general only permit statements as to the 
probability of finding different values of the coordinates and momenta 
q and p or other quantities of interest. Such probability amplitudes, 
nevertheless, will themselves change with the time in the case of an 
isolated system in the definite manner dictated by the Schroedinger 
equation, so that in principle it is also possible in the quantum 
mechanics to make precise predictions as to the changes in the state of 
a system as time proceeds. Nevertheless, here too, just as in the case 
of the classical mechanics, we may encounter situations where it would 
not be practicable nor profitable to try to treat the precise state of 
a system as time proceeds. And under such circumstances we may 
again resort, as in the classical mechanics, to the study of an ensemble 
of systems of the same kind as the one of interest but distributed over 
a range of possible states. From the average behaviour of the systems 
in such an ensemble we may then be assisted as before in drawing con- 
clusions as to the average or expected behaviour of a single system of 
interest. 

In accordance with the above we see that the classical and the 
quantum forms of statistical mechanics are both suitable for making 
predictions merely as to the average or expected values of the co- 
ordinates and momenta or other properties of a given system of interest. 
In the classical statistical mechanics this arises because we then treat 
any single system as being in an average state for the systems in the 
ensemble as a whole. In the quantum statistical mechanics it arises 
for a double reason, not only because we treat any single system as 
being in an average state for the systems of the ensemble, but also 
because the specification of the quantum mechanical state of a single 
system would in general itself merely permit predictions as to the 
average or expectation values of its coordinates and momenta or other 
properties. 

As we proceed we shall find it helpful to have a notation which gives 
specific recognition to this twofold reason for the appearance of proba- 
bilities and averages in the quantum statistical mechanics. For this 
purpose we shall continue to use the letter W to denote a probability 
arising because of a spread in the predictions associated with the 
specification of a single quantum mechanical state; and shall use the letter 
P to denote a probability arising in addition because of the introduc- 
tion of an ensemble of systems distributed over a range of such states. 
Furthermore, we shall also find it convenient to use a muglt bar to 
denote a mean value which has been obtained by the single process of 
averaging over the range of possibilities corresponding to the quantum 
mechanical state of a single system, and to use a dovblA bar to denote 
a mean value which has been obtained by the proems of averaging 



§77 INTRODUCTION OF STATISTICAL METHODS 327 

over the systems of an ensemble. In general the double bar will then 
correspond to the result of taking a mean first over the range of possi- 
bilities presented by each member of the ensemble and then over the 
members of the ensemble. Thus 

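$$\bar{F} = \int \psi^{*}\,\mathbf{F}\,\psi \, dq$$

would denote the mean value of the observable F for a system in the 
quantum mechanical state $\psi$. And 

$$\bar{\bar{F}} = \frac{1}{N}\sum_{\alpha=1}^{N} \int \psi_\alpha^{*}\,\mathbf{F}\,\psi_\alpha \, dq$$

would denote the mean value of that observable for the N systems 
$\alpha = 1, 2, \ldots, N$ of an ensemble. We shall use the double bar, however, 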

in the precise sense of denoting a mean for the members of an ensemble 
without reference to the necessity or not for first taking a mean for 
each member. Thus in the case of a system whose state could be given 
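by a probability amplitude of the form a(k,t), the symbol 

$$\overline{\overline{a^{*}(m,t)\,a(n,t)}} = \frac{1}{N}\sum_{\alpha=1}^{N} a_\alpha^{*}(m,t)\,a_\alpha(n,t)$$

would denote the mean value, over all the systems of the ensemble, of 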
a quantity directly given for each system in the ensemble. We shall 
later see that this particular mean is a very important one. 

78. The density matrix in quantum statistical mechanics 
In the classical statistical mechanics it is convenient to represent the 
state of a system of f degrees of freedom by the position of a point in 
a q-p phase space of 2f dimensions, and then to represent an ensemble 
of such systems by a ‘cloud’ of phase points distributed with the 
density $\rho$. Assuming that this density has been normalized to unity, 
so that we have 

$$\int \cdots \int \rho \; dq_1 \ldots dq_f \, dp_1 \ldots dp_f = 1, \tag{78.1}$$

we can then calculate the mean value, for the systems in the ensemble, 
of any function F(q,p) of the coordinates and momenta with the 
help of the equation 

$$\bar{\bar{F}} = \int \cdots \int F(q,p)\,\rho \; dq_1 \ldots dq_f \, dp_1 \ldots dp_f. \tag{78.2}$$

It has been shown by von Neumann† that a quantity called the 

† von Neumann, Göttinger Nachrichten, 1927, p. 245. See also Dirac, Proc. Cambridge 
Phil. Soc. 25, 62 (1929); 26, 376 (1930); 27, 240 (1930). 



328 STATISTICAL ENSEMBLES IN THE QUANTUM MECHANICS Chap. IX 

density matrix with components $\rho_{nm}$ can be introduced in the quantum 
mechanics to play a role somewhat similar to that of the density $\rho$ in 
the classical mechanics. Since the specific expression of this matrix will 
depend on the particular language or quantum mechanical representation 
which is being used, it will be desirable to use the generalized 
language provided by the transformation theory (see § 67 (f)) in 
defining the matrix. For this purpose let us regard the state of each 
system, in an ensemble of similar non-interacting quantum mechanical 
systems, as represented by a generalized probability amplitude a(n, t). 
This quantity can be regarded as providing the coefficients for expanding 
the probability amplitude in coordinate language in the form 

$$\psi(q,t) = \sum_k a(k,t)\,u(k,q), \tag{78.3}$$

where the u(k, q) are any desired complete set of orthogonal, normalized 
eigenfunctions. Using the general a(n, t) language, the density matrix 
can then be defined by its component elements 

$$\rho_{nm} = \frac{1}{N}\sum_{\alpha=1}^{N} a_\alpha^{*}(m,t)\,a_\alpha(n,t) = \overline{\overline{a^{*}(m,t)\,a(n,t)}}, \tag{78.4}$$

where we take a mean of the products a*(m, t)a(n, t) for all the systems 
of the ensemble $\alpha = 1, 2, \ldots, N$, and the order of the indices for $\rho_{nm}$ 
has been chosen to agree with the convention usually made in this 
connexion. 

Since 

$$W_n = a^{*}(n,t)\,a(n,t)$$

would evidently give the probability of finding an individual system 
described by (78.3) in the characteristic state corresponding to the 
eigenfunction u(n,q), we can write 

$$P_n = \overline{\overline{W_n}} = \overline{\overline{a^{*}(n,t)\,a(n,t)}} = \rho_{nn} \tag{78.5}$$

as an expression for the probability that a system chosen at random 
from the ensemble would be found in the state n.† In accordance with 
(78.5), it will be seen that the diagonal elements of the density matrix 
are necessarily non-negative in any quantum mechanical language. 

As a consequence of the normalization and orthogonality of the 
u(k,q) in (78.3), and of the normalization of $\psi(q,t)$ itself, we can write 
for each system in the ensemble 

$$\int \psi^{*}\psi \, dq = \sum_k a^{*}(k,t)\,a(k,t) = 1,$$

† Later, in Chapter XII, we shall find it convenient to use the symbols $P_n$ and $\rho_{nn}$ 
respectively in the specific senses of the ‘coarse-grained’ and ‘fine-grained’ probabilities 
for finding the eigenstate n in the ensemble. 



and hence for the ensemble as a whole 

$$\sum_k P_k = \sum_k \rho_{kk} = \overline{\overline{\sum_k a^{*}(k,t)\,a(k,t)}} = 1. \tag{78.6}$$

This result, which gives unit probability as the chance of finding a 
system chosen from the ensemble in one or another of the possible states 
denoted by k, may be regarded as the quantum mechanical analogue 
of the classical expression for normalization given by (78.1). It will be 
noted that the classical operation of integrating the density $\rho$ over the 
whole of phase space is replaced in the quantum mechanics by the 
process of taking the sum over all the diagonal elements of the density 
matrix, thus obtaining the so-called trace of that matrix. 
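
The normalization just described may be illustrated numerically. The following is a minimal sketch and not part of the original treatment: it assumes a small hypothetical ensemble of three two-state systems with made-up normalized amplitudes, and the helper names `density_matrix` and `trace` are illustrative only. It forms each element of the density matrix as the ensemble mean of the products a*(m)a(n), and then prints the diagonal elements and their sum.

```python
import math


def density_matrix(ensemble):
    """rho[n][m] = mean over the systems of conj(a[m]) * a[n]."""
    size = len(ensemble[0])
    count = len(ensemble)
    return [[sum(a[m].conjugate() * a[n] for a in ensemble) / count
             for m in range(size)]
            for n in range(size)]


def trace(matrix):
    """Sum of the diagonal elements."""
    return sum(matrix[k][k] for k in range(len(matrix)))


# Hypothetical ensemble: each inner list holds the amplitudes a(1), a(2)
# of one system, chosen so that |a(1)|^2 + |a(2)|^2 = 1.
r = 1 / math.sqrt(2)
ensemble = [
    [1 + 0j, 0j],
    [r + 0j, r * 1j],
    [0.6 + 0j, -0.8 + 0j],
]

rho = density_matrix(ensemble)
diagonal = [rho[k][k].real for k in range(len(rho))]
print("diagonal elements:", diagonal)
print("trace:", trace(rho))
```

Under these assumptions the diagonal elements come out non-negative and the trace comes out equal to unity to within rounding, as the text requires.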

It is also possible to use the density matrix to calculate the mean 
value of any observable F for the systems in the ensemble. For a single 
system in the ensemble, the mean value of the observable would be 
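given, in agreement with §67 (b), by 

$$\bar{F} = \int \psi^{*}\,\mathbf{F}\,\psi \, dq = \sum_{m,n} a^{*}(m,t)\,F_{mn}\,a(n,t), \tag{78.7}$$

where $F_{mn}$ is defined by 

$$F_{mn} = \int u^{*}(m,q)\,\mathbf{F}\,u(n,q)\,dq. \tag{78.8}$$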

Hence, noting the definition of the density matrix as given by (78.4), we 
can then write 

$$\bar{\bar{F}} = \sum_{n,m} F_{mn}\,\overline{\overline{a^{*}(m,t)\,a(n,t)}} = \sum_{n,m} F_{mn}\,\rho_{nm} \tag{78.9}$$

for the mean value of the observable for all the systems in the ensemble. 

This result may be regarded as the quantum mechanical analogue of 
the classical expression for obtaining the mean value of any function 
of the coordinates and momenta given by (78.2). Noticing that the 
matrix elements $F_{mn}$ and $\rho_{nm}$ are combined in (78.9) in the manner 
corresponding to matrix multiplication, we can also rewrite this 
expression in the form 

$$\bar{\bar{F}} = \sum_m [F\rho]_{mm} = \sum_n [\rho F]_{nn} \tag{78.10}$$

as the sum of the diagonal elements of the matrix given by $[F\rho]_{mn}$, which 
is itself the product of the two matrices given by $F_{mn}$ and $\rho_{nm}$. Hence, 
in this quantum mechanical analogue to the classical equation (78.2), 
we again see that the integral over all of phase space of a classical 
quantity is replaced in the quantum mechanics by the trace of the 
corresponding quantum mechanical matrix. 
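
The equivalence of the two ways of obtaining the ensemble mean may be checked on a small case. The sketch below is only an illustration under stated assumptions: the ensemble of three two-state systems and the Hermitian matrix `F` are hypothetical, and the function names are not taken from the text. It computes the mean once by averaging the single-system means over the ensemble, and once as the trace of the product of the matrices for F and for the density.

```python
def density_matrix(ensemble):
    """rho[n][m] = mean over the systems of conj(a[m]) * a[n]."""
    size = len(ensemble[0])
    count = len(ensemble)
    return [[sum(a[m].conjugate() * a[n] for a in ensemble) / count
             for m in range(size)]
            for n in range(size)]


def expectation(a, F):
    """Single-system mean: sum over m, n of conj(a[m]) * F[m][n] * a[n]."""
    size = len(a)
    return sum(a[m].conjugate() * F[m][n] * a[n]
               for m in range(size) for n in range(size))


def trace_of_product(F, rho):
    """Sum over m, n of F[m][n] * rho[n][m], i.e. the trace of F times rho."""
    size = len(F)
    return sum(F[m][n] * rho[n][m]
               for m in range(size) for n in range(size))


# Hypothetical Hermitian matrix for the observable F.
F = [[1 + 0j, 0.5j],
     [-0.5j, -1 + 0j]]

# Hypothetical ensemble of normalized two-state systems.
ensemble = [
    [1 + 0j, 0j],
    [0.6 + 0j, 0.8j],
    [0j, 1 + 0j],
]

direct = sum(expectation(a, F) for a in ensemble) / len(ensemble)
via_trace = trace_of_product(F, density_matrix(ensemble))
print("mean of single-system means:", direct)
print("trace of F times rho:      ", via_trace)
```

The two printed numbers agree to within rounding, and both are real because the assumed matrix for F is Hermitian.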



79. Transformation of the density matrix from one quantum 
mechanical language to another 

In introducing the density matrix by the equation of definition (78.4) 
given above, we have used the quantum mechanical mode of 
representation corresponding to the probability amplitude a(n,t), which 
furnishes the coefficients in the expansion 

$$\psi(q,t) = \sum_k a(k,t)\,u(k,q). \tag{79.1}$$

It is evident, however, since we have made no use of any specific 
properties of the eigenfunctions u(k,q), that we should have obtained 
results having a similar form and leading to the same physical 
predictions if we had used any other mode of representation, corresponding, 
say, to the probability amplitude b(r, t) and the eigenfunctions v(r, q) 
in the expansion 

$$\psi(q,t) = \sum_r b(r,t)\,v(r,q). \tag{79.2}$$

It will prove useful later if we now investigate the unitary transformation 
between two such languages in agreement with our general treatment 
of unitary transformations in § 67 (e). 

Using a prime to denote the density matrix in the new language 
provided by the v(r, q), we may evidently write at once, in accordance 
with our previous equation (67.32) for transforming between the 
probability amplitudes b(r,t) and a(k,t), 

$$\rho'_{sr} = \sum_{k,l} S^{*}_{ks}\,\rho_{kl}\,S_{lr} \tag{79.3}$$

as the equation of transformation for the components of the density 
matrix from the old to the new language. The expression for 
transforming the density matrix is hence of the same form as that given by 
(67.34) for transforming the matrix corresponding to an ordinary 
observable from one quantum mechanical representation to another. 

We can also use the relations for a unitary transformation to show 
that the new language is equivalent, as it must be, to the old for 
the purpose of drawing physical conclusions. This equivalence may be 
regarded as guaranteed by the general principle of the invariance of the 
trace of a matrix under unitary transformation of its representation, 
since, as we have seen above, physical conclusions are drawn in the 
present connexion by taking the trace of some appropriate matrix. 
Nevertheless, it may be informing to examine the matter in detail. 
We may first show that the new density matrix, in the language 
corresponding to the new eigenfunctions $v_r$, would be properly 
normalized. For the total probability in that language of finding a system 
chosen from the ensemble in one or another of the characteristic states 
corresponding to the $v_r$, we shall have 

$$\sum_r P'_r = \sum_r \rho'_{rr}. \tag{79.4}$$

In accordance with (79.3) and (67.30), however, we can write 

$$\sum_r \rho'_{rr} = \sum_{r,k,l} S^{*}_{kr}\,\rho_{kl}\,S_{lr} = \sum_{k,l} \rho_{kl}\,\delta_{kl} = \sum_k \rho_{kk}, \tag{79.5}$$

which demonstrates that we shall have the same normalization, e.g. to 
unity, in the new language as in the old. 

We may also show that the new language provides, as it must, the 
appropriate expressions for the mean values of observables. Using 
that language, the mean value of an observable F for the systems of 
the ensemble should be given by 

$$\bar{\bar{F}} = \sum_{s,r} F'_{rs}\,\rho'_{sr}, \tag{79.6}$$

where 

$$F'_{rs} = \int v_r^{*}\,\mathbf{F}\,v_s\,dq \tag{79.7}$$

would be the matrix components corresponding to the operator F in 
our present language. Making use, however, of previous relations given 
by (79.3), (79.7), (67.29), and (67.28), we can write 

$$\bar{\bar{F}} = \sum_{s,r} F'_{rs}\,\rho'_{sr} = \sum_{s,r}\sum_{k,l} S_{lr}\,F'_{rs}\,S^{*}_{ks}\,\rho_{kl} = \sum_{k,l} F_{lk}\,\rho_{kl}, \tag{79.8}$$

which shows, in view of (78.9) and (79.6), that we shall obtain, as we 
must, the same mean values for an observable in either language. 

The two results (79.5) and (79.8), which can be written in the forms 

$$\sum_r \rho'_{rr} = \sum_k \rho_{kk}$$

and 

$$\sum_r [F'\rho']_{rr} = \sum_k [F\rho]_{kk},$$

are particular cases of the general invariance of the trace of a matrix 
towards unitary transformation, which was mentioned above. 
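
The invariance of these two traces may be verified on a small example. The following sketch is not from the original text: the 2x2 unitary matrix `S`, the density matrix `rho`, and the observable matrix `F` are all hypothetical values chosen for illustration, and the transformation is written in the matrix form M' = S†MS, which is assumed here to be the matrix statement of the transformation rule for the density matrix.

```python
import cmath
import math


def dagger(M):
    """Conjugate transpose of a square matrix given as nested lists."""
    size = len(M)
    return [[M[j][i].conjugate() for j in range(size)] for i in range(size)]


def matmul(A, B):
    """Product of two square matrices of equal size."""
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def trace(M):
    return sum(M[k][k] for k in range(len(M)))


# Hypothetical unitary matrix: a rotation combined with a phase.
theta, phi = 0.7, 0.3
c, s = math.cos(theta), math.sin(theta)
S = [[complex(c), -s * cmath.exp(1j * phi)],
     [s * cmath.exp(-1j * phi), complex(c)]]

# Hypothetical Hermitian density matrix (trace 1) and observable matrix.
rho = [[0.62 + 0j, 0.1 - 0.2j],
       [0.1 + 0.2j, 0.38 + 0j]]
F = [[1 + 0j, 0.5j],
     [-0.5j, -1 + 0j]]


def transform(M):
    """Return S^dagger M S, the matrix in the new language."""
    return matmul(matmul(dagger(S), M), S)


rho_new, F_new = transform(rho), transform(F)
print("trace of rho, old and new:", trace(rho), trace(rho_new))
print("trace of F rho, old and new:",
      trace(matmul(F, rho)), trace(matmul(F_new, rho_new)))
```

Both pairs of printed values agree to within rounding, whatever unitary matrix is substituted for the assumed `S`.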

In the case of a finite Hermitian matrix, in accordance with a 
well-known theorem, it would necessarily be possible to make a unitary 
transformation to a representation which will put the matrix itself into 
diagonal form. It will be reasonable to assume the same possibility for 
the density matrix even though it has an infinite number of components. 
Having made such a transformation, the diagonal terms of the density 
matrix may then be called its eigenvalues. Since we have already seen, 
by (78.5) and (78.6), that the diagonal elements of the density matrix are 
necessarily non-negative in any language and have a sum equal to unity, 
we may now state the general principle that the eigenvalues of the 
density matrix are necessarily non-negative with a sum equal to unity, 

$$\rho_{kk} \geq 0, \qquad \sum_k \rho_{kk} = 1. \tag{79.9}$$

Among the possibilities of transforming the density matrix from one 
quantum mechanical language to another, we have, of course, the 
possibility of transforming from the original generalized language, 
corresponding to the probability amplitude a(n, t) in terms of which we 
defined the density matrix $\rho_{nm}$, to coordinate language, corresponding 
to the familiar probability amplitude $\psi(q, t)$. It will be of interest to 
show that the transformation to this language does lead to the expected 
form of expression for the density matrix. It will be sufficient for this 
purpose to consider a one-dimensional case. 

For the components of the transformed density matrix we shall have 

$$\rho'_{sr} = \sum_{k,l} \rho_{kl}\,S^{*}_{ks}\,S_{lr}, \tag{79.10}$$

in accordance with (79.3), where the components of the transformation 
matrix will have the values, in agreement with (67.29), 

$$S^{*}_{ks} = \int u_k\,v_s^{*}\,dq \quad\text{and}\quad S_{lr} = \int u_l^{*}\,v_r\,dq, \tag{79.11}$$

$u_k(q)$ and $u_l^{*}(q)$ being eigenfunctions for the states determining the 
untransformed language, and $v_s^{*}(q)$ and $v_r(q)$ being eigenfunctions for 
the states determining the transformed language. In order that the 
transformation shall actually be to a coordinate representation, these 
latter eigenfunctions must be taken as cases of our previous solution 
(64.24), characteristic of eigenvalues of the coordinates, 

$$v_s^{*}(q) = \delta(q - q_s) \quad\text{and}\quad v_r(q) = \delta(q - q_r), \tag{79.12}$$


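where we use the delta function of Dirac. Substituting into (79.11), we 
then obtain 

$$S^{*}_{ks} = \int u_k(q)\,\delta(q - q_s)\,dq = u_k(q_s)$$

and 

$$S_{lr} = \int u_l^{*}(q)\,\delta(q - q_r)\,dq = u_l^{*}(q_r), \tag{79.13}$$

where $u_k(q_s)$ and $u_l^{*}(q_r)$ are the values of the eigenfunctions, determining 
the original language, at the points $q_s$ and $q_r$ respectively. Substituting 
into (79.10), and remembering the original definition for $\rho_{kl}$, as given 
by (78.4), we then obtain 

$$\rho'_{sr} = \sum_{k,l} \overline{\overline{a_l^{*}a_k}}\;u_k(q_s)\,u_l^{*}(q_r) = \overline{\overline{\psi^{*}(q_r,t)\,\psi(q_s,t)}}, \tag{79.14}$$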

where the last form of writing is made possible by the form of 
expansion (78.3) for the probability amplitude $\psi(q, t)$ in terms of the 
eigenfunctions $u_k(q)$. 

We thus see that the density matrix does have the expected form in 
coordinate language in terms of the probability amplitudes $\psi(q_s, t)$ 
and $\psi^{*}(q_r, t)$ for the coordinate eigenvalues $q_s$ and $q_r$. And the 
treatment could easily be extended to systems of more than one degree of 
freedom. 

In agreement with this finding it is evident, of course, that we could 
have started with a definition of the density matrix in coordinate 
language. Since our considerations were going to be very general in 
character, however, it seemed best to start with a definition in the very 
general language provided by the transformation theory as discussed 
in §67 (f). Furthermore, this general language, by giving no specific 
recognition to the circumstance that continuous as well as discrete 
spectra of eigenstates must actually be considered, really provides a 
simpler formalism for the treatments that we must undertake, than 
coordinate language with its definite recognition of the circumstance 
that the eigenvalues actually do exhibit in that case a continuous 
spectrum. 

80. Density matrix corresponding to a pure state 

The preceding sections have shown the use of the density matrix $\rho_{nm}$ 
in describing the distribution of the quantum mechanical systems which 
compose a statistical ensemble. A case of particular interest arises when 
all the systems in the ensemble are in the same state. We shall then 
say that the ensemble is in a pure state; namely, that state which is 
common to all the members of the ensemble. In making statistical 
applications, an ensemble in a pure state can be regarded as representing 
an individual system whose exact quantum mechanical state is known; 
and an ensemble in an appropriate mixed state can be regarded as 
representing a system whose complete possible quantum mechanical 
specification is not known. 



In the case of an ensemble in a pure state, the density matrix will 
evidently reduce to 

$$\rho_{nm} = a_m^{*}\,a_n, \tag{80.1}$$

since at any given time all the systems in the ensemble will have the 
same values of $a_m^{*}$ and $a_n$. The density matrix for a pure state can, 
of course, still be normalized to unity with 

$$\sum_m \rho_{mm} = \sum_m a_m^{*}\,a_m = 1, \tag{80.2}$$

and equations of the same form as (78.9), 

$$\bar{F} = \sum_{n,m} F_{mn}\,\rho_{nm}, \tag{80.3}$$

can then still be used to calculate the mean value of any observable F, 
where a double bar over the F is now no longer really necessary since 
the mean is the same for each member of the ensemble. 

When an ensemble is in a pure state the density matrix will satisfy 
the special relation 

$$\sum_k \rho_{nk}\,\rho_{km} = \sum_k a_k^{*}a_n\,a_m^{*}a_k = a_m^{*}\,a_n = \rho_{nm}, \tag{80.4}$$

since $\sum_k a_k^{*}a_k$ will be equal to unity. Hence we may take the equality 
of the square of the density matrix with itself, 

$$[\rho^{2}]_{nm} = \rho_{nm}, \tag{80.5}$$

as a necessary requirement for a pure state. By considering the 
possibility of transforming to a representation in which the density matrix 
is diagonal, it will be seen that this requirement can only be satisfied 
if the eigenvalues of the density matrix are all equal to 0 or 1. From 
this it is evident that one and only one of the eigenvalues will be unity, 
since their sum also has to be equal to unity by (79.9). This means 
that all the systems in the ensemble must be in the same state, and 
hence (80.5) is a sufficient as well as a necessary requirement for a pure 
state. In the general case $\rho_{nm} - [\rho^{2}]_{nm}$ will always be a non-negative 
matrix, since the eigenvalues of $\rho_{nm}$ can in no case be greater than unity. 
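
The criterion that the square of the density matrix shall equal the matrix itself lends itself to a simple numerical test. The sketch below is illustrative only: the two ensembles are hypothetical, and the function name `is_pure` and its tolerance are assumptions introduced here. It builds the density matrix for an ensemble whose members all share one state and for an ensemble divided equally between two states, and applies the criterion to each.

```python
def density_matrix(ensemble):
    """rho[n][m] = mean over the systems of conj(a[m]) * a[n]."""
    size = len(ensemble[0])
    count = len(ensemble)
    return [[sum(a[m].conjugate() * a[n] for a in ensemble) / count
             for m in range(size)]
            for n in range(size)]


def matmul(A, B):
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def is_pure(rho, tol=1e-12):
    """True when every element of rho squared matches rho within tol."""
    squared = matmul(rho, rho)
    size = len(rho)
    return all(abs(squared[n][m] - rho[n][m]) < tol
               for n in range(size) for m in range(size))


# Hypothetical ensemble with every member in the same normalized state.
pure_ensemble = [[0.6 + 0j, 0.8j]] * 3

# Hypothetical ensemble split equally between two different states.
mixed_ensemble = [[1 + 0j, 0j], [0j, 1 + 0j]]

pure_result = is_pure(density_matrix(pure_ensemble))
mixed_result = is_pure(density_matrix(mixed_ensemble))
print("same state throughout:", pure_result)
print("two states, equal weights:", mixed_result)
```

For the second ensemble the density matrix is diagonal with both elements equal to one half, so that its square has diagonal elements of one quarter and the criterion fails.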

The distinction between pure and mixed states is of considerable 
importance for statistical mechanics and its applications to the inter- 
pretation of thermodynamics. 

If we have the maximum knowledge of a system allowed by the quantum 
mechanics, the system at any time of interest will be in a perfectly 
definite quantum mechanical state, and can then be represented by an 
ensemble in a pure state, each member of the ensemble being in the 
same state as the system itself. The only uncertainties as to its 
properties and the only needs for taking averages will then arise because 
of the inherently statistical character of the quantum mechanics itself, 
for which there is no classical analogy. On the other hand, if our 
knowledge of a system is less than maximal so that we represent it by 
an ensemble in a mixed state, then the system itself might be in any 
one of the various quantum mechanical states represented in the en- 
semble. Further uncertainties and needs for taking averages will then 
be present, of the same kind as encountered in the classical mechanics, 
due to the distribution over different states. The distinction between 
pure and mixed states is thus related to the necessity for taking two 
kinds of averages which we have already emphasized earlier. 

When we come to the statistical mechanical interpretation of 
thermodynamics we shall find it possible to relate the entropy of 
systems to the statistical properties of the ensembles suitable for their 
representation. The distinction between pure and mixed states will 
then be important, since we shall find the rational zero-point for the 
entropy of a system to be that amount of entropy which it has when 
the corresponding representative ensemble is in a pure state. 

81. The analogue of Liouville’s theorem in quantum statistical 
mechanics 

We have previously shown that the density matrix $\rho_{nm}$ for an 
ensemble of quantum mechanical systems can to some degree at least be 
regarded as the quantum analogue of the density $\rho$ for a classical 
ensemble. We may now investigate the rate of change of the elements 
$\rho_{nm}$ with time. We shall thus obtain the quantum analogue of 
Liouville’s theorem giving the dependence on time of the classical density $\rho$. 

Although an equation for the time dependence of the elements $\rho_{nm}$ 
of the density matrix can be written in a general form applicable in 
any quantum mechanical language, the specific content of that form 
will depend on the particular mode of representation employed. In 
practice we are often interested in representing the state of a system, 
either by the coefficients for an expansion in terms of the true steady 
state solutions for the system, or by the coefficients for an expansion 
in terms of a set of approximately steady state solutions which 
correspond to the Hamiltonian of the system after neglecting a small 
perturbation term. For this reason we shall first treat the dependence of 
$\rho_{nm}$ on time for these special cases, and postpone the general treatment 
until the end of the present section. 

(a) Time dependence of density matrix in language provided by true 
energy states. Let us consider a system with the Hamiltonian operator 
H. With the help of this operator we can determine a set of 
eigenfunctions $u_n(q)$ corresponding to the various possible eigenvalues $E_n$ of 
the energy of the system by considering the allowable solutions of the 
equation 

$$\mathbf{H}\,u_n(q) = E_n\,u_n(q). \tag{81.1}$$

Any state of the system can then be expressed as an expansion in terms 
of these eigensolutions, having the form 

$$\psi(q,t) = \sum_k c_k\,u_k(q)\,e^{-(2\pi i/h)E_k t}, \tag{81.2}$$

where the $c_k$ are constant coefficients, and the probability amplitude a(n, t) 
corresponding to the energy level language is seen to be given by 

$$a(n,t) = c_n\,e^{-(2\pi i/h)E_n t}. \tag{81.3}$$


In accordance with this expression, the density matrix with our 
present mode of representation will have the components 

$$\rho_{nm} = \overline{\overline{a_m^{*}a_n}} = \overline{\overline{c_m^{*}c_n}}\;e^{(2\pi i/h)(E_m - E_n)t}, \tag{81.4}$$

where we take a mean over all the systems of the ensemble. Since the 
coefficients are constants, the dependence of the density matrix on 
time will then be given by 

$$\frac{\partial \rho_{nm}}{\partial t} = -\frac{2\pi i}{h}\,(E_n - E_m)\,\rho_{nm}. \tag{81.5}$$

The form of the time dependence of the density matrix is hence very 
simple in our present language. We shall see later that (81.5) is a 
special case of the general formula for time dependence in any quantum 
mechanical language, and that this general formula furnishes a natural 
analogue for Liouville’s theorem in the classical mechanics. 
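
As a small worked illustration, not taken from the original text, we may consider a system with only two energy levels $E_1$ and $E_2$, assuming the sign convention $e^{-(2\pi i/h)Et}$ for the time factor of a steady state. Integrating the simple time dependence found above for each element separately then gives:

```latex
\begin{align*}
\rho_{nm}(t) &= \rho_{nm}(0)\, e^{-(2\pi i/h)(E_n - E_m)t}, \\
\rho_{11}(t) &= \rho_{11}(0), \qquad \rho_{22}(t) = \rho_{22}(0), \\
\rho_{12}(t) &= \rho_{12}(0)\, e^{-(2\pi i/h)(E_1 - E_2)t},
  \qquad |\rho_{12}(t)| = |\rho_{12}(0)|.
\end{align*}
```

On this reading the diagonal elements, which give the probabilities for the two energy states, do not change with time, while the non-diagonal elements merely rotate in phase with the frequency $(E_1 - E_2)/h$ and keep a constant magnitude.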

(b) Time dependence of density matrix in language provided by 
unperturbed energy states. Let us next consider that the Hamiltonian 
operator for our system can be treated as the sum of two parts, 

$$\mathbf{H} = \mathbf{H}^{0} + \mathbf{V}, \tag{81.6}$$

the unperturbed Hamiltonian operator $\mathbf{H}^{0}$ which corresponds to a nearly 
complete expression for the energy of the system and the perturbation 
operator $\mathbf{V}$ which corresponds to the remainder. With the help of the 
unperturbed Hamiltonian we can now determine a set of unperturbed 
energy eigenfunctions $u_n(q)$ corresponding to the various possible 
eigenvalues $E_n^{0}$ for the energy of the unperturbed system, by considering 
the allowable solutions of the equation 

$$\mathbf{H}^{0}\,u_n(q) = E_n^{0}\,u_n(q). \tag{81.7}$$

Any state of the system can then be expressed as an expansion in terms 
of these unperturbed eigenfunctions, having the form 

$$\psi(q,t) = \sum_k c_k(t)\,u_k(q)\,e^{-(2\pi i/h)E_k^{0}t}, \tag{81.8}$$

where the $c_k$ must now be allowed to vary with the time in the manner 
actually demanded by the Schroedinger equation, since the $u_k(q)$ and 
$E_k^{0}$ are not now quite the true steady state solutions and energy levels 
for the actual system. The probability amplitude a(n,t) corresponding 
to our present unperturbed energy level language is seen to be given by 

$$a(n,t) = c_n(t)\,e^{-(2\pi i/h)E_n^{0}t}. \tag{81.9}$$

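In accordance with this expression the density matrix will now have 
the components 

$$\rho_{nm} = \overline{\overline{a_m^{*}a_n}} = \overline{\overline{c_m^{*}c_n}}\;e^{(2\pi i/h)(E_m^{0} - E_n^{0})t}, \tag{81.10}$$

where we again take a mean over all the systems of the ensemble. 

Differentiating with respect to t, the time dependence of the density 
matrix will now be given by 

$$\frac{\partial \rho_{nm}}{\partial t} = -\frac{2\pi i}{h}\,(E_n^{0} - E_m^{0})\,\rho_{nm} + \overline{\overline{\left(c_m^{*}\frac{\partial c_n}{\partial t} + c_n\frac{\partial c_m^{*}}{\partial t}\right)}}\;e^{(2\pi i/h)(E_m^{0} - E_n^{0})t}. \tag{81.11}$$

In our consideration of the method of variation of constants in § 68, 
however, we have already obtained for the time dependence of the 
coefficients the expression 

$$\frac{\partial c_n}{\partial t} = -\frac{2\pi i}{h}\sum_k V_{nk}\,c_k\,e^{(2\pi i/h)(E_n^{0} - E_k^{0})t}, \tag{81.12}$$

with 

$$V_{nk} = \int u_n^{*}(q)\,\mathbf{V}\,u_k(q)\,dq \tag{81.13}$$

and 

$$V_{nk} = V_{kn}^{*}. \tag{81.14}$$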

Using these results in connexion with (81.11), we can then express 
the time dependence of the density matrix in the desired form 

$$\frac{\partial \rho_{nm}}{\partial t} = -\frac{2\pi i}{h}\,(E_n^{0}\rho_{nm} - \rho_{nm}E_m^{0}) - \frac{2\pi i}{h}\sum_k (V_{nk}\,\rho_{km} - \rho_{nk}\,V_{km}). \tag{81.15}$$

This expression, as compared with (81.5), contains an additional group 
of terms because of the fact that $E_n^{0}$ and $E_m^{0}$ are now unperturbed 
rather than true energy levels for the system. 

The present language is one of those most frequently used in statistical 
mechanics. The language is especially advantageous when the effects 
of the perturbation operator V actually are small, since approximate 
integrations can then be carried out (see Chapter XI) which will let 
us study the transitions of a system between nearly steady states. For 
example, in the case of a dilute gas $\mathbf{H}^{0}$ can correspond to the energy 
of the molecules, considered as not affecting each other, and V to their 
energy of interaction. This then provides an appropriate language for 
treating the transitions between those nearly steady states of the gas 
which would persist as permanently steady states in the absence of 
collisions. 

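(c) Time dependence of density matrix in general language. To obtain 
a more general treatment of the time dependence of the density matrix, 
let us now consider an expansion for the state of our system in the 
quite general form 

$$\psi(q,t) = \sum_k a_k(t)\,u_k(q), \tag{81.16}$$

where the $u_k(q)$ may now be any complete set of normalized, orthogonal 
eigenfunctions for the system, and the $a_k(t)$ are the corresponding 
probability amplitudes. 

For the rate of change of $a_n(t)$ with time we then have the generalized 
Schroedinger equation (67.13) 

$$\frac{\partial a_n}{\partial t} = -\frac{2\pi i}{h}\sum_k H_{nk}\,a_k, \tag{81.17}$$

with 

$$H_{nk} = \int u_n^{*}(q)\,\mathbf{H}\,u_k(q)\,dq \tag{81.18}$$

and 

$$H_{nk} = H_{kn}^{*}. \tag{81.19}$$

With this apparatus the rate of change of $\rho_{nm} = \overline{\overline{a_m^{*}a_n}}$ in our 
present quite general language assumes the form 

$$\frac{\partial \rho_{nm}}{\partial t} = \overline{\overline{a_m^{*}\frac{\partial a_n}{\partial t} + a_n\frac{\partial a_m^{*}}{\partial t}}} = -\frac{2\pi i}{h}\sum_k \overline{\overline{\left(H_{nk}\,a_m^{*}a_k - H_{mk}^{*}\,a_k^{*}a_n\right)}},$$

which can be more simply written as 

$$\frac{\partial \rho_{nm}}{\partial t} = -\frac{2\pi i}{h}\sum_k \left(H_{nk}\,\rho_{km} - \rho_{nk}\,H_{km}\right). \tag{81.20}$$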

Two remarks may be made in connexion with this result, which gives 
a general expression for the quantum analogue of Liouville’s theorem. 

In the first place, it will be immediately appreciated that our 
previous expressions (81.5) and (81.15) for the rate of change of the density 
matrix with time are really special cases of this more general result. 
They differ in apparent form from (81.20) merely because they contain 
the explicit expressions for $H_{nk}$ and $H_{km}$ that correspond respectively 
to the languages provided by the true and by the unperturbed energy 
states for the system. 

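In the second place, if we rewrite (81.20) in the form 

$$\frac{\partial \rho_{nm}}{\partial t} = -\frac{2\pi}{ih}\,[\rho H - H\rho]_{nm} \tag{81.21}$$

or, in matrix language, 

$$\frac{\partial \|\rho\|}{\partial t} = -\frac{2\pi}{ih}\,\bigl(\|\rho\|\,\|H\| - \|H\|\,\|\rho\|\bigr),$$

we see, by comparison with our previous equation (19.9) for the 
dependence of the classical density $\rho$ on time, that the quantities 
$(2\pi/ih)[\rho H - H\rho]_{nm}$ and $(2\pi/ih)(\|\rho\|\,\|H\| - \|H\|\,\|\rho\|)$ appearing in the above equations may 
be regarded in the present connexion as the analogues of the classical 
Poisson bracket $\{\rho, H\}$. The similarity will be noted with previous 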
analogues for Poisson brackets as given by (63.3) and (67.27). It will 
also be noted that we take the Poisson bracket or its analogue with 
a positive sign to obtain the time dependence of the exact or expectation 
value of a function of the coordinates and momenta for a single system, 
and with a negative sign to obtain the time dependence of the density 
or density matrix for an ensemble of such systems. 
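
The agreement between the general commutator form of the time dependence and the simple phase factors of the energy language may be checked numerically. The sketch below is an illustration under stated assumptions and is not from the original text: the three energy levels and the initial density matrix are hypothetical, units are chosen so that h = 2π (making the factor 2πi/h equal to i), and the sign convention e^{-(2πi/h)Et} is assumed for steady states. It compares a central-difference estimate of the time derivative with the commutator expression.

```python
import cmath

# Hypothetical energy levels, in units where h = 2*pi.
E = [0.0, 1.0, 2.5]

# Hypothetical Hermitian initial density matrix with unit trace.
rho0 = [[0.5, 0.1 + 0.05j, 0.02j],
        [0.1 - 0.05j, 0.3, 0.05],
        [-0.02j, 0.05, 0.2]]
size = len(E)


def rho_at(t):
    """Energy language: rho_nm(t) = rho_nm(0) * exp(i (E_m - E_n) t)."""
    return [[rho0[n][m] * cmath.exp(1j * (E[m] - E[n]) * t)
             for m in range(size)]
            for n in range(size)]


def liouville_rhs(rho):
    """General form: -i * sum_k (H_nk rho_km - rho_nk H_km), H diagonal."""
    H = [[E[n] if n == k else 0.0 for k in range(size)] for n in range(size)]
    return [[-1j * sum(H[n][k] * rho[k][m] - rho[n][k] * H[k][m]
                       for k in range(size))
             for m in range(size)]
            for n in range(size)]


t, dt = 0.4, 1e-6
plus, minus = rho_at(t + dt), rho_at(t - dt)
numeric = [[(plus[n][m] - minus[n][m]) / (2 * dt) for m in range(size)]
           for n in range(size)]
exact = liouville_rhs(rho_at(t))

max_error = max(abs(numeric[n][m] - exact[n][m])
                for n in range(size) for m in range(size))
print("largest discrepancy:", max_error)
```

The discrepancy is limited only by the finite-difference step, and the diagonal elements of the commutator expression vanish, so that the probabilities for the energy states remain constant.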


82. Conditions for statistical equilibrium 
For the purposes of statistical mechanics we shall be specially 
interested in ensembles which are in statistical equilibrium; that is, 
having such a distribution of systems among the different possible 
quantum states that the components of the density matrix $\rho_{nm}$ do not 
change with time. In analogy with the classical mechanics this result 
can be achieved either by taking an initial distribution with the density 
matrix set equal to a constant, or by taking a distribution with the 
density matrix set equal to some function of the matrix for any 
constant of motion for the system such as its energy. We may give separate 
attention to these two possibilities. 

(a) Density a constant. Let us begin by considering the possibility 
of setting the density matrix equal to a constant. The component 
elements of the density matrix will then be given by 

$$\rho_{nm} = \rho_0\,\delta_{nm}, \tag{82.1}$$

the non-diagonal elements of the matrix all being equal to zero, and the 
diagonal elements all equal to the same quantity $\rho_0$. 

We may readily show that the distribution given by (82.1) would 
persist as time proceeds. To see this, we have as our general equation 
(81.20) for the dependence of the density matrix on time 

$$\frac{\partial \rho_{nm}}{\partial t} = -\frac{2\pi i}{h}\sum_k (H_{nk}\,\rho_{km} - \rho_{nk}\,H_{km}). \tag{82.2}$$

And substituting (82.1), we obtain 

$$\frac{\partial \rho_{nm}}{\partial t} = -\frac{2\pi i}{h}\,\rho_0 \sum_k (H_{nk}\,\delta_{km} - \delta_{nk}\,H_{km}) = 0. \tag{82.3}$$

We thus find, by starting with the initial distribution $\rho_{nm} = \rho_0\,\delta_{nm}$, 
that we do obtain an ensemble which retains that same distribution 
independent of time. We shall see later that the properties of such 
a permanent uniform distribution will be very important in setting up 
the fundamental postulate of the quantum statistics as to equal a 
priori probabilities and random a priori phases. 

(b) Density a function of a constant of the motion. We may now turn 
to the more general possibility of securing statistical equilibrium, by 
setting the density matrix $\|\rho\|$ equal to a function of a matrix $\|A\|$ 
corresponding to any constant of the motion for the kind of system 
under consideration. 

For this purpose we may begin by taking 

$$F_{nm}(t) = \int u_n^{*}(q)\,e^{(2\pi i/h)\mathbf{H}t}\,\mathbf{F}\,e^{-(2\pi i/h)\mathbf{H}t}\,u_m(q)\,dq \tag{82.4}$$

as a general expression, in the language given by the eigenfunctions 
$u_n(q)$, for the elements of the time-dependent matrix $\|F(t)\|$, corresponding 
to any dynamical variable for the system. The symbols F and H 
in this expression denote the operators in coordinate language which 
correspond to the variable of interest F and to the energy of the system 
H, and the exponential operators $e^{(2\pi i/h)\mathbf{H}t}$ and $e^{-(2\pi i/h)\mathbf{H}t}$ may be regarded 
if desired as a shorthand expression for the corresponding series 
expansions, as will be discussed in more detail in § 96. It will be noted that 
the equation of definition (82.4) is a particular case of our previous 
general expression (67.23) for time-dependent matrices, with 

$$\psi_n(q,t) = e^{-(2\pi i/h)\mathbf{H}t}\,u_n(q) \quad\text{and}\quad \psi_m(q,t) = e^{-(2\pi i/h)\mathbf{H}t}\,u_m(q)$$

as the two formally correct solutions of the Schroedinger equation 
which are now of interest. 



Differentiating (82.4) with respect to the time t, assuming that H 
and F have no explicit dependence on t, we obtain 

$$\frac{\partial F_{nm}(t)}{\partial t} = \frac{2\pi i}{h}\,[HF(t) - F(t)H]_{nm} = \frac{2\pi i}{h}\sum_k \bigl(H_{nk}\,F_{km}(t) - F_{nk}(t)\,H_{km}\bigr), \tag{82.5}$$

where the first form of writing depends on the commutability of the 
operator H with any function of itself, and the second on the general 
relation between quantum mechanical operators and the corresponding 
matrices. It will be noted that (82.5) is a particular case of our previous 
equation (67.27). 

Equation (82.5) gives a general expression in our present language 
for the time dependence of the matrix elements corresponding to any 
dynamical quantity F. We may now define a constant of the motion A 
as being a quantity which has an operator A which commutes with that 
for the energy H, 

$$\mathbf{H}\mathbf{A} = \mathbf{A}\mathbf{H}. \tag{82.6}$$

In accordance with (82.5), we shall then have, for such a quantity, 
matrix elements which do not change with time and hence can be 
regarded at all times as given by the simplified expression 

$$A_{nm}(t) = A_{nm} = \int u_n^{*}(q)\,\mathbf{A}\,u_m(q)\,dq. \tag{82.7}$$

Returning now to Liouville’s theorem (81.20) for the rate of change 
in the density matrix with time, 

$$\frac{\partial \rho_{nm}}{\partial t} = -\frac{2\pi i}{h}\sum_k (H_{nk}\,\rho_{km} - \rho_{nk}\,H_{km}), \tag{82.8}$$

we see that this rate of change will be zero for an ensemble set up with 
the density matrix put equal to any function of the matrix for a 
constant of the motion, 

$$\rho_{nm} = [f(A)]_{nm}, \tag{82.9}$$

since the right-hand side of (82.8) will then be zero in accordance with 
the commutativity expressed by (82.6). Hence we can now achieve 
statistical equilibrium in this quite general way. As in the classical 
statistical mechanics, we shall be specially interested in cases of 
statistical equilibrium which are obtained by taking the density as a function 
of the energy, 

$$\rho_{nm} = [f(H)]_{nm}. \tag{82.10}$$
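
The vanishing of the rate of change for a density matrix taken as a function of the energy may be seen in a small numerical case. The following sketch is not part of the original text: the 2x2 Hermitian matrix `H` is hypothetical, the function f is assumed for illustration to be a quadratic polynomial with made-up coefficients, and no attempt is made to normalize the resulting matrix. For contrast it also evaluates the commutator for a diagonal density matrix that is not a function of `H`.

```python
def matmul(A, B):
    size = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def largest_commutator_element(H, rho):
    """Largest magnitude among the elements of H rho - rho H."""
    left, right = matmul(H, rho), matmul(rho, H)
    size = len(H)
    return max(abs(left[n][m] - right[n][m])
               for n in range(size) for m in range(size))


# Hypothetical Hermitian energy matrix, not diagonal in this language.
H = [[1 + 0j, 0.5 - 0.5j],
     [0.5 + 0.5j, 2 + 0j]]
identity = [[1 + 0j, 0j], [0j, 1 + 0j]]
H2 = matmul(H, H)

# Assumed illustrative choice: f(H) = 0.5*I - 0.2*H + 0.05*H^2.
c0, c1, c2 = 0.5, -0.2, 0.05
rho_function_of_H = [[c0 * identity[n][m] + c1 * H[n][m] + c2 * H2[n][m]
                      for m in range(2)]
                     for n in range(2)]

# A density matrix that is not a function of H, for comparison.
rho_other = [[0.7 + 0j, 0j], [0j, 0.3 + 0j]]

stationary = largest_commutator_element(H, rho_function_of_H)
changing = largest_commutator_element(H, rho_other)
print("rho = f(H):", stationary)
print("rho not a function of H:", changing)
```

The first value vanishes to within rounding, so that such an ensemble is in statistical equilibrium, while the second does not vanish and the corresponding density matrix would change with time.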




83. The uniform, microcanonical, and canonical ensembles in 
the quantum mechanics 

We may now consider in some detail the properties of different kinds 
of ensemble used in the quantum mechanics, which may be regarded 
as the analogues of the uniform, microcanonical, and canonical en- 
sembles familiar in the classical mechanics. 

(a) The uniform ensemble. The first of these is the uniformly 
distributed ensemble with the elements of the density matrix given by 

$$\rho_{nm} = \rho_0\,\delta_{nm}, \tag{83.1}$$

where $\rho_0$ is a constant. As we have already seen in the preceding 
section, such a distribution, if once set up, would remain unaltered as time 
proceeds, and hence the uniform ensemble has the important property 
of being in statistical equilibrium. 

In the case of such uniform ensembles, since there will be equal 
probabilities of finding a system chosen from the ensemble in any one 
of an infinite number of different states, it is not profitable to try to 
normalize the density matrix to unity. Instead, we may, if desired, take 

$$\rho_{nn} = W_n = \rho_0, \tag{83.2}$$

with $\rho_0$ finite, as expressing the relative probability of finding a system
chosen at random from the ensemble in state n, such relative proba- 
bilities in this case being equal for all the different states. 

An important property of the uniform ensemble lies in the circum- 
stance that the density matrix for this particular ensemble would have 
just the same form of expression in different quantum mechanical 
representations. To show this we have as our general formula (79.3) 
for transforming the density matrix to a new quantum mechanical
language

$$\rho'_{sr} = \sum_{n,m} S_{sn}\,\rho_{nm}\,S^*_{rm},$$

and, substituting (83.1), this gives us

$$\rho'_{sr} = \rho_0 \sum_n S_{sn}\,S^*_{rn} = \rho_0\,\delta_{sr}, \tag{83.3}$$

where the last form of writing depends on the fundamental property
given by (67.30) for the matrix elements $S_{kr}$ of unitary
transformations.

In accordance with (83.3), we now see that an ensemble set up to 
have a uniform distribution in one particular quantum mechanical 
language would not only retain a uniform distribution for all time in
the original language, but would also have a uniform distribution, and 





with the same magnitude of the constant $\rho_0$, in any language. This is
the quantum mechanical analogue of the classical finding, in §22 (a),
that a uniform ensemble would not only be in statistical equilibrium
but would be distributed with the same density $\rho$ in the phase spaces
corresponding to any choice of classical canonical variables. 
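The invariance of the uniform ensemble against change of language can be checked numerically in a small case. The sketch below assumes the transformation convention rho'_{sr} = sum over n, m of S_{sn} rho_{nm} S*_{rm} for a unitary matrix S; the particular 2 x 2 unitary matrix, the angles, and the function names are illustrative assumptions, and the index convention of formula (79.3) in the text may differ in detail.

```python
import cmath
import math


def unitary_2x2(t, a, b):
    """A 2x2 unitary matrix built from three real angles (illustrative choice)."""
    c, s = math.cos(t), math.sin(t)
    return [[cmath.exp(1j * a) * c, cmath.exp(1j * b) * s],
            [-cmath.exp(-1j * b) * s, cmath.exp(-1j * a) * c]]


def transform(rho, s):
    """rho'_{sr} = sum_{n,m} S_{sn} rho_{nm} conj(S_{rm}) (assumed convention)."""
    dim = len(rho)
    return [[sum(s[i][n] * rho[n][m] * s[j][m].conjugate()
                 for n in range(dim) for m in range(dim))
             for j in range(dim)] for i in range(dim)]


S = unitary_2x2(0.6, 0.3, 1.1)

rho0 = 0.5
uniform = [[rho0, 0.0], [0.0, rho0]]   # rho_nm = rho0 * delta_nm
unequal = [[0.7, 0.0], [0.0, 0.3]]     # diagonal, but with unequal probabilities

uniform_new = transform(uniform, S)
unequal_new = transform(unequal, S)

print(uniform_new)  # still rho0 on the diagonal, zero elsewhere (up to rounding)
print(unequal_new)  # acquires non-diagonal terms in the new language
```

Only the uniform matrix keeps its form; a diagonal matrix with unequal elements is no longer diagonal after the transformation.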

To complete this discussion of uniform ensembles in the quantum 
mechanics, a certain point of difference, as compared with uniform 
ensembles in the classical mechanics, must now be considered. In the 
classical mechanics it was clear that a uniform distribution of systems 
in the phase space, with 


$$\rho = \text{const.}, \tag{83.4}$$


would necessarily imply the same number of systems in each of the 
groups of states corresponding to the equal regions $\delta q_1 \ldots \delta p_f$ into which
we might think of the phase space as divided. In the quantum 
mechanics, however, the uniform ensemble, expressed by 


$$\rho_{nm} = \rho_0\,\delta_{nm}, \tag{83.5}$$


cannot be regarded as carrying necessary implications as to the num- 
bers of systems in different states, owing to the new possibility for the 
superposition of quantum mechanical states. 

To investigate this we may return to our original definition (78.4) 
for the density matrix and express this in a form to show the 
dependence of the density matrix on the magnitudes and phases of 
the probability amplitudes $a_m^*$ and $a_n$ for the different states of the
systems in the ensemble. For this purpose we write

$$\rho_{nm} = \overline{\overline{a_m^* a_n}} = \overline{\overline{r_n r_m\{\cos(\phi_n - \phi_m) + i\sin(\phi_n - \phi_m)\}}}, \tag{83.6}$$

where $r_n$ and $r_m$ are the absolute magnitudes, and $\phi_n$ and $\phi_m$ are the
phases, of the probability amplitudes for the states n and m of a system
in the ensemble, and the double bar indicates that we are to take a
mean of the quantity indicated for all the systems in the ensemble.
For the case of a uniform ensemble, the expression given by (83.6) must
then reduce to

$$\overline{\overline{r_n r_m\{\cos(\phi_n - \phi_m) + i\sin(\phi_n - \phi_m)\}}} = \rho_0\,\delta_{nm} \tag{83.7}$$

for any choice of states n and m. 

It is evident, however, that this result would not be sufficient to give 
unique information as to the distribution of the systems of the ensemble





in different states, since (83.7) could evidently be satisfied by taking 
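the magnitudes $r_n$, $r_m$, etc., for the different systems in any manner
which would lead to the mean results

$$\overline{\overline{r_n^2}} = \overline{\overline{r_m^2}} = \cdots = \rho_0, \tag{83.8}$$

and by taking the phases $\phi_n$, $\phi_m$, etc., in any manner which would lead
to the mean results

$$\overline{\overline{r_n r_m\cos(\phi_n - \phi_m)}} = 0 \quad\text{and}\quad \overline{\overline{r_n r_m\sin(\phi_n - \phi_m)}} = 0, \tag{83.9}$$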

when n and m are not the same. Both of these results could be achieved 
in a great variety of ways. In particular it is to be noted that (83.9) 
would be satisfied by any random selection of phases which would make 
the above means zero on account of the equal weighting for positive 
and negative values of $\cos(\phi_n - \phi_m)$ and $\sin(\phi_n - \phi_m)$ when n and m are
not the same. Hence the fulfilment of (83.7) would not be sufficient 
to describe the superposition of different eigenstates n, m, etc., which 
might prevail for any single system in the ensemble. 
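The way in which random phases remove the non-diagonal terms of (83.6) can be illustrated by a small Monte Carlo average. In the sketch below every system of the ensemble is given the same magnitudes r_n = sqrt(rho0) and independent phases drawn uniformly from 0 to 2 pi; this particular choice of magnitudes, the ensemble size, the fixed seed, and the function name are assumptions made for illustration, being only one of the many ways of satisfying (83.8) and (83.9).

```python
import cmath
import math
import random


def ensemble_density_matrix(n_states, n_systems, rho0, rng):
    """Mean of a_m^* a_n over an ensemble with equal magnitudes and random phases."""
    rho = [[0j] * n_states for _ in range(n_states)]
    for _ in range(n_systems):
        # One system: amplitudes a_n = r_n * exp(i * phi_n) with r_n^2 = rho0.
        amps = [math.sqrt(rho0) * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
                for _ in range(n_states)]
        for n in range(n_states):
            for m in range(n_states):
                rho[n][m] += amps[n] * amps[m].conjugate() / n_systems
    return rho


rng = random.Random(0)  # fixed seed so that the run is reproducible
rho = ensemble_density_matrix(n_states=3, n_systems=20000, rho0=0.25, rng=rng)

for row in rho:
    print([round(abs(x), 4) for x in row])
```

The diagonal elements come out equal to rho0, while the non-diagonal elements are small and decrease roughly as the inverse square root of the number of systems averaged over.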

In accordance with the foregoing, it is customary to describe the 
uniform ensemble as corresponding to equal probabilities and random 
phases for the different eigenstates which define the quantum mechani- 
cal representation that is being employed, where the term random 
phases means that there is no special selection of phases for the different 
states of the different members of the ensemble which would lead to
non-diagonal terms in the density matrix. As a consequence of our
earlier discussion, it is evident that these properties of equal probabilities
and random phases for different eigenstates in the case of any uniform
ensemble would persist both as time proceeds and on transformation to 
other quantum mechanical language. 

It may be specially emphasized that the mere prescription of equal 
probabilities for different eigenstates in a particular quantum mechani- 
cal language, without also prescribing random phases in that language, 
would not be sufficient to secure these important properties of the 
maintenance of equal probabilities for different eigenstates as time 
proceeds, and the similar maintenance on transformation to other 
quantum mechanical languages. If we had contented ourselves with 
defining the uniform ensemble in a particular quantum mechanical 


language by

$$\rho_{nn} = \rho_0 \qquad (\rho_{nm} \neq \rho_0\,\delta_{nm}),$$

we should have indeed secured equal probabilities for the different
states n determining that language. Nevertheless, our proof of statisti- 
cal equilibrium for the ensemble, as given by (82.3), would no longer 





be possible since in general $\|\rho\|$ and $\|H\|$ would not commute, and our
proof of invariance against transformation, as given by (83.3), would
no longer be possible, since in general we should not have a reduction 
in form which would permit us to use the simple property of trans- 
formation matrices. 

Just as in the classical statistics, the quantum mechanical uniform
ensemble cannot be regarded as a representative ensemble for a system
concerning the condition of which we have any information, since the 
ensemble gives equal probabilities for all possible different states. As 
in the classical statistics, however, we shall find the properties of the 
uniform ensemble very important when we come to set up the funda- 
mental postulatory basis for statistical mechanics. 

(b) The microcanonical ensemble. We must next consider the analogue
of the classical microcanonical ensemble, which will be defined,
also in the quantum mechanics, in a manner to give uniform probability
of finding members of the ensemble in energy states lying
within a narrow energy range and zero probability for finding members
of the ensemble outside that range. To obtain such an ensemble it will
be necessary to set the density matrix equal to an appropriate function
of the energy matrix for the kind of system under consideration. It will
be advantageous to begin by making some preliminary general remarks
as to the nature of such functional relationships between two matrices.

The possibility of setting one matrix equal to a function f of another
depends on the possibility of regarding the function f as having a form
which can be represented with sufficient accuracy by a power series in
its argument. With the help of the operations of matrix multiplication
and addition it will then be possible to obtain explicit expressions for
the elements of the first matrix in terms of those of the other. Thus
we may relate the density matrix to the energy matrix by the expression

$$\|\rho\| = f(\|H\|),$$

provided we regard this as equivalent to an expansion of the form

$$\|\rho\| = a_0 + a_1\|H\| + a_2\|H\|^2 + a_3\|H\|^3 + \cdots, \tag{83.10}$$

where the quantities $a_0$, $a_1$, $a_2$, $a_3$, etc., are constant coefficients. As
explicit expressions for the elements of the density matrix in terms of
the elements of the energy matrix, we can then write, in accordance
with the rules for matrix multiplication,

$$\rho_{nm} = a_0\,\delta_{nm} + a_1 H_{nm} + a_2 \sum_k H_{nk}H_{km} + a_3 \sum_{k,l} H_{nk}H_{kl}H_{lm} + \cdots, \tag{83.11}$$

and this can also be expressed in the abbreviated form 

$$\rho_{nm} = [f(H)]_{nm}, \tag{83.12}$$

if we so desire. 

The above expressions assume a very simple form if we make use of 
the mode of representation provided by the energy language. The 

energy matrix will then be diagonal with elements given by 

$$H_{nm} = \int u_n^*(q)\,\mathbf{H}\,u_m(q)\,dq = E_m\,\delta_{nm}, \tag{83.13}$$

owing to the circumstance that $u_m(q)$ will now be an energy eigenfunction,
and owing to the normalization and orthogonality of such
eigenfunctions. Substituting expressions of the form (83.13) into 
(83.11), we can then write 

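$$\begin{aligned}
\rho_{nm} &= a_0\,\delta_{nm} + a_1 E_m\,\delta_{nm} + a_2 \sum_k E_k\,\delta_{nk}\,E_m\,\delta_{km} + a_3 \sum_{k,l} E_k\,\delta_{nk}\,E_l\,\delta_{kl}\,E_m\,\delta_{lm} + \cdots \\
&= (a_0 + a_1 E_m + a_2 E_m^2 + a_3 E_m^3 + \cdots)\,\delta_{nm},
\end{aligned} \tag{83.14}$$

and this can also be expressed in the abbreviated form

$$\rho_{nm} = f(E_m)\,\delta_{nm}. \tag{83.15}$$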
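The reduction of the power series for a function of the energy matrix to a function of the energy eigenvalues on the diagonal can be verified directly for a small diagonal matrix. The sketch below is only an illustration: the three eigenvalues, the four coefficients, and the helper names are assumptions not found in the text.

```python
def matmul(a, b):
    """Ordinary matrix product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def series_of_matrix(coeffs, h_mat):
    """Return a0*1 + a1*H + a2*H^2 + ... built by repeated matrix multiplication."""
    n = len(h_mat)
    result = [[0.0] * n for _ in range(n)]
    power = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for a in coeffs:
        result = [[result[i][j] + a * power[i][j] for j in range(n)]
                  for i in range(n)]
        power = matmul(power, h_mat)
    return result


def f(e, coeffs):
    """The same power series evaluated for an ordinary number e."""
    return sum(a * e ** p for p, a in enumerate(coeffs))


energies = [1.0, 2.0, 4.0]                 # hypothetical eigenvalues E_m
coeffs = [0.5, -0.25, 0.125, 0.0625]       # hypothetical a0, a1, a2, a3
H = [[energies[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

rho = series_of_matrix(coeffs, H)
for m in range(3):
    print(rho[m][m], f(energies[m], coeffs))  # the two columns agree
```

In the energy language the matrix products never mix different states, so each diagonal element is simply the series evaluated at the corresponding eigenvalue, and all non-diagonal elements remain zero.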

In the light of this preliminary discussion we can now easily define 

the quantum mechanical microcanonical ensemble by requiring a form
for the function $f(E_m)$ such that the density matrix expressed in the
energy language will be given by

$$\begin{aligned}
\rho_{nm} &= \rho_0\,\delta_{nm} && (E_m \text{ in range } E \text{ to } E+\delta E), \\
\rho_{nm} &= 0 && (E_m \text{ not in range } E \text{ to } E+\delta E),
\end{aligned} \tag{83.16}$$

where $\rho_0$ is a constant, and $E$ to $E+\delta E$ denotes some particular small
range of interest in the energy of the systems under consideration. This
simple form of expression holds, of course, only in the energy language, 
and more complicated forms of expression in which the density matrix 
would exhibit non-diagonal as well as diagonal terms would be found 
in general on transformation to other modes of representation. 

In accordance with the above definition there would be equal 
probabilities,

$$\rho_{nn} = W_n = \rho_0, \tag{83.17}$$

for finding a system selected at random from the ensemble in each of
the eigenstates n for an eigenvalue of the energy $E_n$ lying in the range





$E$ to $E+\delta E$, and zero probability for finding a system with energy
outside that range. In the case of degeneracy, each eigenstate corre- 
sponding to the same energy is to be treated as a separate state, having 
its appropriate quota of systems. 
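The construction of the microcanonical density matrix in the energy language, including the separate counting of degenerate eigenstates, may be sketched as follows. The list of energy eigenvalues, the chosen range, the normalization of the probabilities to unity, and the function name are assumptions introduced for illustration; the text itself leaves the constant in (83.17) unspecified.

```python
def microcanonical_density_matrix(energies, e_low, delta_e):
    """Diagonal density matrix with equal weight for states in [e_low, e_low + delta_e].

    Each entry of `energies` is one eigenstate, so a degenerate level appears
    once for every eigenstate belonging to it.  The weights are normalized to
    unit total probability, which is an assumption of this sketch.
    """
    in_range = [e_low <= e <= e_low + delta_e for e in energies]
    count = sum(in_range)
    if count == 0:
        raise ValueError("no eigenstate lies in the chosen energy range")
    rho0 = 1.0 / count
    n = len(energies)
    return [[rho0 if (i == j and in_range[i]) else 0.0 for j in range(n)]
            for i in range(n)]


# Hypothetical spectrum: the level 1.00 is doubly degenerate.
energies = [1.00, 1.00, 1.05, 2.00, 3.00]
rho = microcanonical_density_matrix(energies, e_low=0.95, delta_e=0.15)

diagonal = [rho[i][i] for i in range(len(energies))]
print(diagonal)
```

The two eigenstates of the degenerate level each receive their own quota, so that three states share the total probability equally and the two states outside the range receive none.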

In accordance with § 82 (b), the microcanonical ensemble has the
important property of being in statistical equilibrium, since the density 
matrix has been set up as a function of the energy matrix, and our 
actual interest will lie in conservative systems for which the energy will 
be a constant of the motion. Hence the expression for the density 
matrix, as given in the energy language by (83.16), will remain unaltered
as time proceeds, and the probability of finding the different
energy states will remain permanently equal to $\rho_0$ or 0, according as
the energy does or does not lie in the range $E$ to $E+\delta E$. This is in
agreement with the quantum mechanical form of the principle of the 
conservation of energy, which makes the relative probabilities of finding 
different values of the energy remain constant with time for each 
individual system in the ensemble. 

The microcanonical distribution will prove useful, in the quantum 
mechanics as in the classical mechanics, as a representative ensemble 
for an individual system in a steady condition of equilibrium with its 
energy content sufficiently well determined to be assigned to the 
specified range $E$ to $E+\delta E$. Since a determination of the energy of
a quantum mechanical system of interest would be subject to the
Heisenberg uncertainty,

$$\Delta E\,\Delta t \approx h, \tag{83.18}$$

where $\Delta t$ is the time available for observation, it will be appropriate
to regard $\delta E$ as large compared with $\Delta E$. In the quantum mechanics
there will be even less reason than in the classical mechanics for considering
surface ensembles with all the systems having precisely the
same energy, since a precise determination of energy would necessitate
an infinite time of observation and in the absence of degeneracy would
put the system into some single energy eigenstate.

(c) The canonical ensemble. It is also useful in the quantum statistics 
to have an analogue of the classical canonical ensemble. This may be 
defined in general by setting the density matrix $\|\rho\|$ equal to a function
of the energy matrix $\|H\|$, for the systems under consideration, having
the form

$$\|\rho\| = e^{(\psi - \|H\|)/\theta}, \tag{83.19}$$

where $\psi$ and $\theta$ are parameters which determine the particular distribution





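of interest, and we regard the expression given as equivalent to the
series expansion

$$\|\rho\| = e^{\psi/\theta}\left\{1 - \frac{\|H\|}{\theta} + \frac{1}{2!}\frac{\|H\|^2}{\theta^2} - \frac{1}{3!}\frac{\|H\|^3}{\theta^3} + \cdots\right\}. \tag{83.20}$$

In accordance with the rules for matrix multiplication, we can then
write as an expression for the elements of the density matrix

$$\rho_{nm} = e^{\psi/\theta}\left\{\delta_{nm} - \frac{H_{nm}}{\theta} + \frac{1}{2!}\frac{1}{\theta^2}\sum_k H_{nk}H_{km} - \frac{1}{3!}\frac{1}{\theta^3}\sum_{k,l} H_{nk}H_{kl}H_{lm} + \cdots\right\}, \tag{83.21}$$

valid for any language in which the $H_{nm}$ are expressed.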

In the energy language the elements of the energy matrix will assume 
the simple form 

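$$H_{nm} = \int u_n^*(q)\,\mathbf{H}\,u_m(q)\,dq = E_m\,\delta_{nm}, \tag{83.22}$$

and the above expression will reduce to

$$\rho_{nm} = e^{\psi/\theta}\left\{1 - \frac{E_m}{\theta} + \frac{1}{2!}\frac{E_m^2}{\theta^2} - \frac{1}{3!}\frac{E_m^3}{\theta^3} + \cdots\right\}\delta_{nm}.$$

This can then be conveniently rewritten as

$$\rho_{nm} = e^{(\psi - E_m)/\theta}\,\delta_{nm}. \tag{83.23}$$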

The equation of definition for the canonical ensemble thus assumes a 
very simple form in the language corresponding to the energy eigenvalues
$E_n$ for the system.

In accordance with the above definition of the canonical ensemble, 
the probability for finding a system selected at random from the 
ensemble in any specified energy eigenstate n would be given by 

$$\rho_{nn} = W_n = e^{(\psi - E_n)/\theta}, \tag{83.24}$$

where each eigenstate is to be treated as a separate state in the case
of degeneracy. Since the total probability for finding a system in one
or another state will be taken as normalized to unity,

$$\sum_n \rho_{nn} = \sum_n W_n = \sum_n e^{(\psi - E_n)/\theta} = 1, \tag{83.25}$$

the distribution parameters $\psi$ and $\theta$ will be subject to the relation

$$e^{-\psi/\theta} = \sum_n e^{-E_n/\theta}, \tag{83.26}$$

where we take a summation over all energy eigenstates n. Owing to 
a definition obtained by setting the density matrix equal to a function 
of the energy matrix for the systems under consideration, the canonical 
ensemble will have the important property of being in statistical equilibrium,
in the case of conservative systems, and the distribution given
in the energy language by (83.23) will remain unaltered as time pro- 
ceeds. 
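The relation between the two distribution parameters imposed by the normalization (83.25) can be made concrete by a short calculation. In the sketch below the energy eigenvalues, the value of theta, and the function name are illustrative assumptions; psi is obtained from the requirement that exp(-psi/theta) equal the sum of exp(-E_n/theta) over all eigenstates.

```python
import math


def canonical_distribution(energies, theta):
    """Return (psi, probabilities) for the canonical ensemble in the energy language.

    psi is fixed by exp(-psi/theta) = sum_n exp(-E_n/theta), and the probability
    of eigenstate n is W_n = exp((psi - E_n)/theta).
    """
    state_sum = sum(math.exp(-e / theta) for e in energies)
    psi = -theta * math.log(state_sum)
    probs = [math.exp((psi - e) / theta) for e in energies]
    return psi, probs


# Hypothetical spectrum; the level 1.0 is doubly degenerate and is listed twice,
# each eigenstate being treated as a separate state.
energies = [0.0, 1.0, 1.0, 2.5]
theta = 0.8

psi, probs = canonical_distribution(energies, theta)
print(psi)
print(probs, sum(probs))
```

With psi chosen in this way the probabilities sum to unity, and the ratio of the probabilities of any two eigenstates depends only on the difference of their energies divided by theta.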

The close analogy between the properties of the quantum canonical 
ensemble as defined above and of the classical canonical ensemble as 
defined in § 22 (c) will be noted. As was shown for the classical 
mechanics by Gibbs, the canonical ensemble will prove especially useful 
in the quantum mechanics when we come to consider the relations 
between statistical mechanics and thermodynamics, and the canonical 
distribution will then be found to provide the appropriate representative 
ensemble for a system of interest having a specified temperature. 

The foregoing examples of the uniform, microcanonical, and canonical 
ensembles have been selected to illustrate the general nature of en- 
sembles in the quantum mechanics because of their definite character 
and of their usefulness for our later considerations. It will be appre- 
ciated, however, in the quantum statistics as in the classical statistics, 
that we are in no way limited to ensembles which are in statistical 
equilibrium, and that ensembles where the distribution is changing with 
time will be of great importance when we come to consider systems of
interest which are not themselves in a steady condition of macroscopic
equilibrium. 

It may be well to call attention at this point to the consideration 
that ensembles in the quantum mechanics are, of course, to be con- 
structed, in the case of systems containing essentially similar particles, 
in such a manner as to exclude non-accessible states not having the
symmetry properties known to occur in nature. This circumstance, for 
which there was no classical analogy, provides the explanation for im- 
portant differences between classical and quantum statistical results, 
as we shall see later. It is also to be noted in the case of dynamical 
variables having no classical analogy (e.g. spin) that ensembles must
be constructed so as to include the states that correspond to the 
different possible eigenvalues of such quantities. 

84. The fundamental hypothesis of equal a priori probabilities 
and random a priori phases for the quantum mechanical 
states of a system 

We must now turn to a consideration of the fundamental hypothesis 
of the quantum statistical mechanics. In the quantum as in the classical 
case, the necessity for some postulate, in addition to those sufficient 
for an exact mechanics, arises when we desire to give a reasonable




treatment to the properties and behaviour of some system of interest 
in a condition which is not sufficiently specified to give a precise deter- 
mination of state. Under such circumstances we then resort to an 
appropriately chosen representative ensemble of systems, of similar 
structure to the one of actual interest, and regard the average properties
and average behaviour of the systems in this ensemble as providing
estimates as to what may reasonably be expected for the actual system
of interest. In setting up such a representative ensemble we shall, of
course, take the members of the ensemble as being in states which agree
with our partial knowledge of the precise state of the actual system of
interest. We need some hypothesis, however, to guide us in choosing
the probabilities and phases for different states that agree equally well
with that partial knowledge.

For this purpose we now introduce, as a postulate, the hypothesis of 
equal a priori probabilities and random a priori phases for the quantum
mechanical states of a system. This hypothesis may be taken in a general
way as signifying, in the absence of precise knowledge of the probability
amplitudes $a_n = r_n e^{i\phi_n}$ for the different eigenstates n of a system, that
there is no a priori reason, provided, for example, by the quantum
mechanics itself, for assuming other than equal importance for the
probabilities, $W_n = r_n^2$, or other than random values for the phases $\phi_n$,
of states n which agree equally well with our knowledge of the actual
condition of the system.

We can obtain a more adequate idea of the meaning of the hypothesis
by applying it in a specific example. For this purpose let us consider
a situation in which our partial knowledge of the state of the system
is regarded as obtained from an approximate measurement of some
observable quantity F which is a property of the system. As the equations
determining the eigenstates of the system corresponding to this
observable, we can write, in coordinate language for specificity,

$$\mathbf{F}\,u_k(q) = F_k\,u_k(q), \tag{84.1}$$

where $\mathbf{F}$ is the quantum mechanical operator corresponding to the
observable F, and the different possible eigenvalues of this quantity
and associated eigensolutions are denoted by $F_k$ and $u_k(q)$. With the
help of these eigensolutions, any state of the system could then be
expressed by a summation of the form

$$\psi = \sum_k a_k u_k = \sum_k r_k e^{i\phi_k} u_k, \tag{84.2}$$

where the complex quantities $a_k = r_k e^{i\phi_k}$ are the probability amplitudes
for the different eigenstates k at any time of interest.





A precise measurement of the quantity F, showing that it had a 
particular eigenvalue $F_k$, would now tell us, neglecting degeneracy,
that the system was in a particular one of the above states k. The
properties and behaviour of the system could then be treated by the
precise methods of the quantum mechanics, by setting the probability
for this particular state equal to unity with

$$W_k = r_k^2 = 1, \tag{84.3}$$

and by taking the phase $\phi_k$ for this state at random, since we immediately
see that its value has no effect on the properties of the
system at the time of measurement, and see, from the fact that the
generalized Schroedinger equation (67.13) would reduce to

$$\frac{da_n}{dt} = -\frac{2\pi i}{h}\,H_{nk}\,a_k \tag{84.4}$$

with only one surviving term of the summation, that its value would
have no effect on the probabilities $a_n^* a_n$ which we should predict for any
state n at a later time. Cf. § 96 (c). In view of the approximate character
of our actual measurement of the quantity F, however, we shall actually
have to turn to the methods of statistical mechanics and construct an 
appropriate representative ensemble for the treatment of the system. 
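The remark that the phase of a single occupied state has no effect on later probabilities, whereas the relative phases of several occupied states do, can be illustrated with a two-state system. In the sketch below the later amplitudes are obtained from the initial ones by a fixed unitary matrix `U`, which merely stands in for the result of integrating the Schroedinger equation over some interval; the matrix, the angles, and the function names are assumptions made for illustration.

```python
import cmath
import math


def unitary_2x2(t, a, b):
    """A 2x2 unitary matrix built from three real angles (illustrative choice)."""
    c, s = math.cos(t), math.sin(t)
    return [[cmath.exp(1j * a) * c, cmath.exp(1j * b) * s],
            [-cmath.exp(-1j * b) * s, cmath.exp(-1j * a) * c]]


def evolve(u, amps):
    """Later amplitudes a_n(t) = sum_k U_nk a_k(0)."""
    return [sum(u[n][k] * amps[k] for k in range(2)) for n in range(2)]


def probabilities(amps):
    """Probabilities a_n^* a_n for each state n."""
    return [abs(a) ** 2 for a in amps]


U = unitary_2x2(0.7, 0.2, 0.9)  # stands in for the evolution over a fixed interval

# Only state 0 occupied: two different choices of its phase.
single_a = probabilities(evolve(U, [cmath.exp(1j * 0.0), 0.0]))
single_b = probabilities(evolve(U, [cmath.exp(1j * 2.0), 0.0]))

# Both states occupied with equal magnitudes: two different relative phases.
r = 1.0 / math.sqrt(2.0)
mixed_a = probabilities(evolve(U, [r, r]))
mixed_b = probabilities(evolve(U, [r, r * cmath.exp(1j * 2.0)]))

print(single_a, single_b)  # identical: the phase of a single state is irrelevant
print(mixed_a, mixed_b)    # different: relative phases affect later probabilities
```

This is the reason why, once more than one state is admitted by an approximate measurement, some prescription for the phases becomes necessary in addition to that for the probabilities.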

For this purpose we might now proceed by regarding our partial 
information as to the state of the system as equally well satisfied by 
any one of a group of neighbouring states k which have eigenvalues 
$F_k$ in substantial agreement with our approximate measurement of F.
In accordance with our hypothesis as to a priori probabilities and
phases, we could then construct an appropriate representative ensemble
by taking equal mean probabilities $\overline{\overline{r_k^2}}$ and random phases $\phi_k$ for the
different states k that lie in this group of neighbouring states, and
zero probabilities for states lying outside that group. This would then
give us

$$\rho_{kl} = \begin{cases} \rho_0\,\delta_{kl} & (\text{state } k \text{ in group } G), \\ 0 & (\text{state } k \text{ not in group } G), \end{cases} \tag{84.5}$$

as a specific expression for the density matrix of the representative
ensemble at the time of measurement, where the quantity $\rho_0$ is a constant
which gives the equal mean values for the probabilities of the
states that do lie in the group $G$, and any possible non-diagonal terms
of the matrix drop out as a result of averaging over the random phases
for those states.

It would also be possible to proceed by regarding our approximate 




measurement as telling us that the value of F was almost certainly in 
the immediate neighbourhood of a particular eigenvalue $F_0$, with a
decreasing chance for values more and more removed therefrom. In 
accordance with our hypothesis as to equal a priori probabilities and 
random a priori phases, we could then construct an appropriate represen- 
tative ensemble by taking the density matrix as given, for example, by 

$$\rho_{kl} = \rho_0\,e^{-(F_k - F_0)^2/\sigma^2}\,\delta_{kl}, \tag{84.6}$$

where $\rho_0$ and $\sigma$ are constants. With a small value of the dispersion $\sigma$,
the probability for different states k would then be concentrated among
those for which $F_k$ was nearly equal to $F_0$. The two proposals (84.5) and
(84.6), as to an appropriate ensemble for representing our approximate 
knowledge of the quantity F for the system of interest, would lead to 
practically identical conclusions in the situations commonly treated by 
statistical methods. 
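The practical equivalence of the two proposals can be illustrated by comparing the mean value of F computed from each. The sketch below assumes a closely spaced set of eigenvalues, a sharp group of states on the one hand, and on the other a Gaussian weighting of the form exp(-(F_k - F_0)^2 / sigma^2); the eigenvalue spacing, the widths, the precise Gaussian form, the normalization to unit total probability, and the function names are all assumptions made for illustration.

```python
import math


def window_weights(values, f0, half_width):
    """Equal probabilities for eigenvalues within half_width of f0, zero outside."""
    raw = [1.0 if abs(f - f0) <= half_width else 0.0 for f in values]
    total = sum(raw)
    return [w / total for w in raw]


def gaussian_weights(values, f0, sigma):
    """Probabilities falling off as exp(-(F_k - F_0)^2 / sigma^2) (assumed form)."""
    raw = [math.exp(-((f - f0) / sigma) ** 2) for f in values]
    total = sum(raw)
    return [w / total for w in raw]


def mean(values, weights):
    """Ensemble mean of F for a diagonal density matrix with the given weights."""
    return sum(f * w for f, w in zip(values, weights))


# Hypothetical closely spaced eigenvalues F_k = 0.00, 0.01, ..., 20.00.
eigenvalues = [k / 100 for k in range(2001)]
f0 = 10.0

w_window = window_weights(eigenvalues, f0, half_width=0.055)
w_gauss = gaussian_weights(eigenvalues, f0, sigma=0.05)

mean_window = mean(eigenvalues, w_window)
mean_gauss = mean(eigenvalues, w_gauss)
print(mean_window, mean_gauss)
```

Both ensembles give a mean value of F indistinguishable from F_0, and they differ only in the detailed way in which the probability falls off for states whose eigenvalues are removed from F_0.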

As previously mentioned, we shall regard the introduction of the 
hypothesis of equal a priori probabilities and random a priori phases
as making a definite addition to our postulatory basis, which can be 
ultimately justified only by the agreement between the results to be 
calculated with its help and the results actually obtained by observa- 
tion and experiment. Nevertheless, certain remarks may be made 
beforehand as to the reasonable character of the postulate that we 
have chosen. 

First of all, emphasis may be laid on our previous finding, see § 83 (a),
that a uniform ensemble, if once set up, would retain for all time a 
distribution which could be described by saying that it corresponds to 
equal probabilities and random phases for the different eigenstates that 
define the quantum mechanical language employed. We may hence 
conclude that the quantum mechanics does not itself have a character 
which would make one eigenstate inherently more probable than 
another, or would make any particularly ordered relation between the 
phases of different states inherently more probable than a completely 
random relation. In view of this character of the quantum mechanics
itself it would then seem quite arbitrary to adopt any procedure other
than the assignment of equal probabilities and random phases to those
states that do correspond equally well with the approximate knowledge
of state that has been obtained. 

This may be seen more clearly if we return once more to our previous 
example in which our observation of the state of the system was regarded
as insufficient to distinguish between the members of a group





of neighbouring eigenstates, corresponding to eigenvalues $F_k$, all of
which are in substantial agreement with our approximate measurement
of an observable quantity F. With regard to the probabilities of the 
different states in the group it is then immediately evident that 
it would be arbitrary to weight one of these states more highly than 
another, in constructing the appropriate representative ensemble, since 
the quantum mechanics itself has not provided any reason for thinking
that one state is inherently more probable than another, and our actual 
knowledge is equally well represented by any one of the states in the 
group. And with regard to the phases of the different states in the
group it is also evident that it would be arbitrary to proceed other- 
wise than by the assignment of random phases, since the quantum 
mechanics has not itself provided any reason for thinking that any 
particular arrangement of phases is inherently more probable than a 
random one, and the observation actually made on the system has not 
been of a kind to give any information as to the actual phases of the 
different possible eigenstates k. 

To see that an approximate measurement of the quantity F would 
not give information as to the phases $\phi_k$ of the amplitudes $a_k$ for
the eigensolutions $u_k$ that correspond to the eigenvalues $F_k$ of that
quantity, let us take the actual state of the system, at the time of our
approximate measurement thereof, as expressed by an expansion of
the form

$$\psi = \sum_k r_k e^{i\phi_k} u_k, \tag{84.7}$$

where the quantities $r_k e^{i\phi_k}$ are the actual probability amplitudes
at that time for the different eigenstates k. For the probabilities of
finding the different possible values $F_k$ of the quantity F we should
then have, in accordance with (66.9),

$$W(F_k) = a_k^* a_k = r_k^2, \tag{84.8}$$

where the phases $\phi_k$, whatever they may be, disappear from the expression.
It is hence evident that a measurement of F, of any degree of
accuracy, would give us no information as to the phases $\phi_k$ and would
be equally well satisfied by any values of those quantities. It would 
hence indeed be arbitrary to take these phases in any but a random 
way in constructing an appropriate representative ensemble. 
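The disappearance of the phases from the probabilities for the values of F may be checked in a few lines. The magnitudes and the two sets of phases in the sketch below are arbitrary illustrative choices, and the function names are not taken from the text.

```python
import cmath


def amplitudes(magnitudes, phases):
    """Probability amplitudes a_k = r_k * exp(i * phi_k)."""
    return [r * cmath.exp(1j * p) for r, p in zip(magnitudes, phases)]


def probabilities(amps):
    """W(F_k) = a_k^* a_k for each eigenstate k."""
    return [(a.conjugate() * a).real for a in amps]


magnitudes = [0.6, 0.8]  # hypothetical r_k with r_1^2 + r_2^2 = 1

w_first = probabilities(amplitudes(magnitudes, [0.0, 0.0]))
w_second = probabilities(amplitudes(magnitudes, [1.3, -2.1]))

print(w_first)
print(w_second)  # the same values: the phases have dropped out
```

Since both sets of phases lead to the same probabilities, no measurement of F alone, however accurate, could distinguish between them.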

In connexion with the foregoing remarks in justification of the assign- 
ment of random phases to states that agree equally well with our 
approximate knowledge of the condition of the system of interest, it 
will be well to emphasize the important point, that such states, which 


agree equally well with our knowledge, will themselves, of course, be 
eigenstates for the particular kind of quantity which was approximately 
measured in obtaining that knowledge. Hence the assignment of random
phases to these states will lead to the disappearance of non-
diagonal terms in the density matrix for the representative ensemble, 
when the matrix is expressed in the language corresponding to the 
quantity measured. On transformation to other modes of representa- 
tion, non-diagonal terms may, of course, be expected in the density 
matrix, since each of the states, agreeing equally well with our approxi- 
mate knowledge, would then be expressed in general by a summation 
over some new kind of eigensolutions each multiplied by a coefficient 
which would be definitely given as to magnitude and phase. It is also 
to be noted that the assignment of equal probabilities and random 
phases is, of course, to be regarded as applying at the time of our
approximate measurement of the state of the system of interest, and 
that as time proceeds the distribution of the representative ensemble 
will then change in whatever way is demanded by equation (81.20), 
which gives the quantum analogue of Liouville’s theorem. In general 
this actually leads, as time proceeds, to a more uniform distribution 
over different states and a less random distribution of phases as we 
shall see in Chapters XI and XII. 

As a further consideration, giving a preliminary partial justification 
for our postulate as to equal a priori probabilities and random a priori 
phases for the quantum mechanical states of a system, it is to be 
pointed out that this hypothesis would stand in agreement, at the 
correspondence principle limit, with our previous classical hypothesis 
as to equal a priori probabilities for equal extensions in the phase space. 
To discuss this we note as a first point, in accordance with the trans- 
formation properties of the density matrix for a completely uniform 
ensemble as given by (83.3), that the prescription of equal a priori
probabilities and random a priori phases for the eigenstates which 
correspond to any particular quantum mechanical language, would 
necessitate the same equal a priori probabilities for all quantum 
mechanical states corresponding to any language. We also note, as a 
second point, that there is a good one-to-one correlation, as we approach 
the classical limit, between different quantum mechanical states for 
a system of f degrees of freedom and regions of extension $h^f$ in the
phase space which would contain the phase points for classical systems
that exhibit properties in substantial agreement with those of the
quantum mechanical system in the state considered. Such a correlation




can be seen from §§ 62 (a) and 63 (e) of Chapter VII, which show the
possibility of obtaining a quantum mechanical state at the classical
limit, which could be described as a wave packet giving a fairly permanent
concentration of probability density within a region of extension
$h^f$ moving along a classical trajectory; and similar correlations
are given in §§ 69, 71, 72, and 73 of Chapter VIII, where we have found 
— for large quantum numbers, i.e. at the correspondence limit — several 
examples of the association of eigenstates of energy or of angular 
momentum, with corresponding regions of extension $h^f$ in the phase
space, such that the classical systems thereby defined would have sub- 
stantially the same values of those quantities. As a consequence of the 
two points noted above, we now see that our postulate, of equal a priori 
probabilities and random a priori phases for different quantum 
mechanical states, is equivalent at the correspondence principle limit 
to the assumption of equal a priori probabilities for different regions 
of equal extension $h^f$ in the classical phase space, and hence indeed in
agreement with the classical postulate as to equal a priori probabilities
in general for equal regions, owing to the small size of the extensions
$h^f$ from the point of view of practical measurement.
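The correlation between quantum states and regions of extension h^f in the phase space can be illustrated for one degree of freedom. The sketch below takes a linear harmonic oscillator, with energy levels (n + 1/2) h nu and with the classical orbit of energy E enclosing a phase-space area E / nu, and compares the number of eigenstates having energy not exceeding E with the number of cells of extension h contained in that area. The choice of the oscillator as the example, the units with h nu = 1, and the function names are assumptions made for illustration.

```python
import math


def quantum_state_count(energy, h_nu):
    """Number of oscillator eigenstates E_n = (n + 1/2) h nu with E_n <= energy."""
    return max(0, math.floor(energy / h_nu + 0.5))


def classical_cell_count(energy, h_nu):
    """Phase-space area enclosed by the orbit of the given energy, divided by h."""
    return energy / h_nu


for energy in (2.25, 20.25, 1000.25):
    q = quantum_state_count(energy, 1.0)
    c = classical_cell_count(energy, 1.0)
    print(energy, q, c, q / c)
```

The ratio of the two counts approaches unity as the quantum numbers become large, which is the sense in which equal probabilities for quantum states correspond to equal probabilities for equal extensions in the phase space.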

In connexion with the above relation of quantum mechanical proba- 
bilities and phases to classical probabilities, it is interesting to note 
that a partial statement of this relation taken in the reverse sense that 
equal probabilities for equal regions in the classical phase space ought 
to imply equal probabilities for the allowed states of the older quantum 
theory, was often made basic in early attempts to found a satisfactory
quantum statistics. It is also of interest to note that we can secure an 
appropriate symmetry between the statements of the quantum mechani- 
cal and classical postulates by saying that the dual quantum mechanical 
hypothesis, of equal a priori probabilities and of random a priori phases 
for quantum mechanical states, is equivalent at the correspondence 
principle limit to the dual classical hypothesis, of equal a priori proba- 
bilities for equal regions in the classical coordinate space and of equal 
a priori probabilities for equal regions in the classical momentum space. 

This discussion of the fundamental postulate of the quantum statistical
mechanics has been so long that it may be well to conclude the
present section by stating this postulate once more in the specific form,
that the construction of an appropriate representative ensemble, to
correspond to the knowledge we have gained as to the condition of some
system of interest by an approximate measurement of some quantity F,
is to be obtained by assigning equal probabilities and random phases,
at the time of measurement, to those eigensolutions, characteristic of
the quantity F, which agree equally well with the approximate
knowledge of state given by the measurement.
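
The postulate just restated can be illustrated numerically. The following is a minimal sketch, not a prescription from the text: it assumes that each member of the ensemble is described by amplitudes $c_n$ of equal magnitude on the admissible eigenstates with independent, uniformly distributed phases, and that the density matrix is the ensemble average of $c_n c_m^*$. The function names, the choice of five states, and the number of members are hypothetical.

```python
import cmath
import math
import random


def random_phase_state(indices, dim, rng):
    """Amplitudes of equal magnitude on `indices`, each with an independent random phase."""
    amp = 1.0 / math.sqrt(len(indices))
    c = [0j] * dim
    for n in indices:
        c[n] = amp * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
    return c


def ensemble_density_matrix(indices, dim, members, seed=0):
    """Average of c_n * conj(c_m) over `members` systems of the ensemble."""
    rng = random.Random(seed)
    rho = [[0j] * dim for _ in range(dim)]
    for _ in range(members):
        c = random_phase_state(indices, dim, rng)
        for n in range(dim):
            for m in range(dim):
                rho[n][m] += c[n] * c[m].conjugate() / members
    return rho


# States 1, 2, 3 agree with the approximate measurement; states 0 and 4 do not.
rho = ensemble_density_matrix(indices=[1, 2, 3], dim=5, members=20000)
```

Under these assumptions the diagonal elements for the admissible states come out equal, while the off-diagonal elements shrink toward zero as the number of members grows, which is the sense in which random phases remove interference terms from the ensemble averages.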

85. Validity of statistical quantum mechanics 

It will be evident from the discussion given in this chapter that 
statistical methods can be employed in the quantum mechanics for 
predicting the properties and behaviour of a system in a partially 
specified quantum mechanical state, in a manner which is essentially 
similar to that by which they are employed in the classical mechanics
for predicting the properties and behaviour of a system in a partially 
specified classical state. In both cases the procedure consists in setting 
up a representative ensemble of systems of the same structure as the 
one of actual interest, and then using the average properties and be- 
haviour of the systems in this ensemble as giving reasonable estimates 
for quantities pertaining to the actual system of interest. And in both 
cases the representative ensemble is selected in such a manner as to 
agree with our partial knowledge as to the precise state of the actual 
system of interest, and is otherwise constructed in accordance with a 
reasonable and non-arbitrary postulate as to the a priori likelihood of 
different possibilities. 

In changing from the classical to the quantum mechanical application
of this procedure, there are, of course, certain differences which
must be considered. In the case of systems composed of similar particles
we have the new feature that non-accessible states, having symmetry
properties now appreciated as not occurring in nature, must be excluded.
Furthermore, in the case of dynamical variables having no
classical analogue, e.g. spin, the ensembles must now be constructed
so as to allow for the states that correspond to the different possible
eigenvalues of those quantities. More important, it has to be recognized
that our considerations will in general now have a twofold statistical
character, on the one hand for the reason that the quantum mechanics
itself will in general only provide statistical predictions for systems in
precisely specified states, and on the other hand for the reason,
essentially involved in the development of what we name statistical
mechanics, that we shall actually wish to treat systems in states which
are not precisely specified. Moreover, as will be considered more fully
in Chapter XI, it is to be noted that an ensemble which has been set
up to represent a given quantum mechanical state of interest can be
regarded as strictly suitable for this purpose only for the time interval
between one measurement and the next, owing to the disturbing effects
of observation which we now recognize as not necessarily negligible.
Finally, there is, of course, a somewhat striking change in mathematical
formalism when we change from classical to quantum mechanics, and
this is reflected in the apparatus appropriate for the statistical
extensions given to those two disciplines, as is made evident by the change
from the classical phase density $\rho$ to the quantum mechanical density
matrix $\rho_{nm}$. Nevertheless, allowing for such changes in subject-matter,
it is evident that there is a quite similar logical structure for the general
framework within which the classical and quantum statistical methods
are applied.

On account of this fundamental similarity between the logical pre- 
suppositions for the classical and quantum mechanical developments, 
it will be possible also in the quantum case to take the point of view 
as to the validity of statistical considerations which was expressed in 
§ 25. This point of view may be summarized, in the present connexion, 
as follows. The methods of statistical quantum mechanics are to be 
regarded as truly statistical in character, giving results that can be 
expected on the average on repeated experiments with the system in 
question in the condition defined by its partial specification of state, 
but not results that can be precisely expected in a single trial. The 
methods lead to calculated fluctuations around the averages which turn 
out to be small in the case of typical applications, and in other cases 
can be compared with empirical findings. The methods have to be 
based, on account of their statistical character, on some hypothesis as 
to the a priori likelihood of different possibilities; the postulate actually 
introduced for this purpose is the only non-arbitrary one that can be 
selected, and agrees at the correspondence principle limit with that 
selected for the classical statistics. The methods developed do have, 
as far as is known, the a posteriori justification of agreement with 
experimental findings. It will thus be seen that an entirely similar 
point of view as to the validity of statistical mechanics is adopted, for 
the purposes of this book, in the quantum as in the classical case. 

In this connexion it is of interest to inquire into the possibility of 
adopting a different point of view as to the validity of the quantum 
statistics which would be based on some tenable quantum analogue of 
the original ergodic hypothesis. In the classical mechanics the intro- 
duction of the ergodic hypothesis, that a system in the course of time 
would pass through every point in the phase space corresponding to 
classical states compatible with its energy, made it possible to conclude,
as we have seen in § 25, that the probability of finding a given system
of interest at a random time in any specified state would then be equal 
to the probability of finding a system picked at random from the corre- 
sponding microcanonical ensemble in that state, and hence that the 
time average for the properties of any system of interest would be 
equal to the averages of those same properties for the members of the 
ensemble. 

This method of attempting to validate the classical statistics proved 
unsatisfactory when it became recognized that classical systems, except 
for the trivial case of a single degree of freedom, do not have the 
required ergodic character. It might be felt at first sight, however, 
that quantum mechanical systems would perhaps be found to have the 
necessary character, owing to the circumstance that quantum mechani- 
cal states correspond to finite regions in the phase space instead of 
merely to points, or to the circumstance that quantum considerations 
have in general a less deterministic character than classical ones. 
Nevertheless, in spite of feelings of disappointment with the outcome,†
it is found that quantum mechanical systems also do not have the
needed ergodic character and that the proposed method of validating
statistical mechanics is also unsatisfactory in the quantum case.

To discuss this, let us consider a given system of interest for which
we merely specify the energy with an uncertainty $\Delta E$, related to the
time $\Delta t$ available for its measurement, by the Heisenberg expression

$$\Delta E\,\Delta t \approx h. \tag{85.1}$$

And let us consider the treatment of this system with the help of an
appropriate microcanonical ensemble of systems, having a density
matrix which can be described in the language corresponding to the
energy eigensolutions for the system by

$$\rho_{nm} = \begin{cases} \rho_0\,\delta_{nm} & (E_n \text{ in range } E \text{ to } E+\delta E), \\ 0 & (E_n \text{ not in range } E \text{ to } E+\delta E), \end{cases} \tag{85.2}$$

where we choose the range E to $E+\delta E$ in such a way as to include the
specified energy $E_0$, and assume conditions such that $\delta E$ can be taken
as large compared with the uncertainty $\Delta E$ in this energy, and yet at
the same time small enough so that the ensemble can be regarded as
appropriately representing a system of energy $E_0$. We must now inquire
into the possible existence of some quantum mechanical ergodic principle,
which would make the probability at a random time, of finding
our system of interest in a specified quantum mechanical state, the
same as the probability of finding a system picked at random from
the above ensemble in that state, or let us say in a practically identical
state on account of the range in the possible energies E to $E+\delta E$.

† Compare Schroedinger, Ann. der Phys. 83, 956 (1927).

To investigate this we must consider the nature of the eigenfunctions
$u_n(q)$, which describe the states of equal probability in the above
ensemble, and which can also be used by superposition to describe any
possible state for the system of interest. In accordance with their
character, as normalized eigensolutions of the Schroedinger differential
equation

$$H\,u_n(q) = E_n\,u_n(q), \tag{85.3}$$

these eigenfunctions for the case of a confined system of $f$ degrees of
freedom can be expected to have the form

$$u_n(q) = u_n(q_1 \ldots q_f,\ E,\ C_2 \ldots C_f), \tag{85.4}$$

where the index n is to be regarded as designating some particular set
of values for the energy E and for the further $f-1$ parameters $C_2 \ldots C_f$
which determine the different possible eigenfunctions. In the case of
the confined systems, ordinarily treated in statistical mechanics, in
which the component particles are enclosed in walls or held together
by their own forces, we can expect that suitable eigenfunctions $u_n(q)$
will be strictly possible only for discrete values of the energy E and
the other parameters $C_2 \ldots C_f$. Nevertheless, for the situations that
now interest us, we may take the energy range E to $E+\delta E$ as lying
high enough and the number of degrees of freedom as large enough, so
that the eigenvalues for the $f$ quantities $E, C_2 \ldots C_f$ can be treated as
having nearly continuous spectra. The parameters $C_2 \ldots C_f$ are to be
regarded as designating the values of physical properties of the system,
in addition to its energy, which remain constant with time for a system
in a given steady state. By adopting an appropriate set of eigenfunctions,
constructed with the help of linear combinations of any
original set, we can relate some of these 'constants of the motion' in
a simple manner to such well-recognized properties of the system as its
components of linear and of angular momentum. In general, however,
in the case of a system of many degrees of freedom, most of these $f-1$
constants of the motion will have no simple familiar interpretation, as
is also true for most of the constants of the motion that appear in the
classical treatment of a complicated system. These remarks as to the
nature of the eigenfunctions that interest us in statistical applications
are well illustrated, except for the extreme simplicity of the system,
by the eigenfunctions for a particle in a box, as treated in the preceding
chapter in § 71.





With the help of this description of the nature of the eigenfunctions
$u_n(q)$, we can now see in the quantum mechanics that there could be
no ergodic principle which would have the desired character. On the
one hand, in accordance with the description of the microcanonical
ensemble by

$$\rho_{nm} = \begin{cases} \rho_0\,\delta_{nm} & (E_n \text{ in range } E \text{ to } E+\delta E), \\ 0 & (E_n \text{ not in range } E \text{ to } E+\delta E), \end{cases} \tag{85.5}$$

we should have equal probabilities of picking a system from the ensemble
in any state n for which the energy lies in the range E to $E+\delta E$.
On the other hand, in accordance with the possibility of describing any
possible state of the actual system of interest by a summation, over
states in this energy range, of the form

$$\psi = \sum_n c_n\,u_n(q), \tag{85.6}$$

we should have the probabilities $|c_n|^2$ for finding the system itself in
the different states n. There is no reason, however, which would require
equal magnitudes for the coefficients $c_n$ in any particular case; and we
have seen above that the different states n, although all corresponding
to approximately the same energy E, would be characterized by very
different possibilities for the remaining $f-1$ constants of motion $C_2 \ldots C_f$.
Hence, also in the quantum mechanics, we cannot regard a microcanonical
ensemble as giving a necessarily perfect representation of a
particular system of interest, but can only regard it as giving the best
representation available in the absence of knowledge of the system
except as to energy.
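
The argument can be made concrete with a small numerical sketch. It assumes the standard time dependence of the expansion coefficients in the energy language, $c_n(t) = c_n(0)\,e^{-2\pi i E_n t/h}$, and works in units with $h = 1$; the three energies and the initial coefficients are hypothetical values chosen only for illustration.

```python
import cmath
import math


def state_probabilities(c0, energies, t, h=1.0):
    """|c_n(t)|^2 for c_n(t) = c_n(0) * exp(-2*pi*i*E_n*t/h)."""
    return [abs(c * cmath.exp(-2j * math.pi * e * t / h)) ** 2
            for c, e in zip(c0, energies)]


def microcanonical_probabilities(n_states):
    """Equal probabilities for every state in the energy range."""
    return [1.0 / n_states] * n_states


# Hypothetical system: three states in the range, with unequal |c_n|^2.
c0 = [math.sqrt(0.7), math.sqrt(0.2), math.sqrt(0.1)]
energies = [1.00, 1.01, 1.02]
```

Evaluating `state_probabilities(c0, energies, t)` at any time returns the same unequal values, so no amount of waiting brings the system's own probabilities to the equal values given by `microcanonical_probabilities(3)`.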

It is interesting to note the similar reasons for the failure of a classical
or of a quantum system of $f$ degrees of freedom to have ergodic
character. In the classical mechanics the ergodic hypothesis has to be
abandoned, as seen in § 25, since there would be $2f-2$ constants of the
motion, in addition to energy and position along the trajectory, to
which arbitrary values could be assigned. In the quantum mechanics
the analogous hypothesis has to be abandoned, as seen above, since
there would still be $f-1$, or half as many, constants of the motion, in
addition to energy, to which arbitrary values could be assigned. In
both cases such an assignment could then lead to values of properties
of the system of interest, which could not change with time, and which
might be very different from the values for such properties predicted
as probable from the corresponding microcanonical ensemble.

It will, of course, again be appreciated, in the quantum as in the
classical statistics, that the failure to find any exact ergodic principle
does not mean that we must usually expect great deviations between
the properties of an actual system of interest and the corresponding
representative ensemble. In the case of simple constants of the motion,
such as the components of linear and angular momentum of the system
as a whole, we should presumably regard ourselves as knowing the
values of such quantities, and hence should make use of a somewhat
more limited ensemble than the general microcanonical one. And in
the case of constants of the motion, having a very complicated physical
significance, the limitation to particular values would often have but
little effect on the distribution of values for the quantities of actual
interest. It will, of course, also again be appreciated, as in the classical
case, that the assumption of ergodic character for the behaviour of
quantum systems would actually prevent the proposed methods from
exhibiting their full statistical usefulness in allowing for the occasional
occurrence of systems exhibiting quite unusual properties.

It is of interest to realize in this latter connexion, however, that
there would be a somewhat reduced possibility in the quantum
mechanics for the conceptual construction of systems exhibiting
exceptional properties, since the number of constants of motion to which
arbitrary values could be assigned is decreased by a half, as already
noted above, on changing from the classical to the quantum mechanical
treatment of a system of $f$ degrees of freedom. The reason for this
decrease lies in the quantum mechanical elimination of the possibility
for the simultaneous assignment of precise values to conjugate variables.
Thus our classical example (§ 25) of a gas with all its molecules
permanently moving parallel to the x-axis between plane walls would not
be possible in the quantum mechanics. This would be prevented by
the impossibility of simultaneously specifying values for the y and z
coordinates of molecules, which would place them on separate tracks,
together with zero values for the conjugate momenta $p_y$ and $p_z$, which
would secure a permanent motion in those tracks. Hence, speaking
somewhat loosely, we can say that there are only half as many
possibilities, in the quantum mechanics as compared with the classical
mechanics, for systems to exhibit exceptional behaviour.

We may best summarize this chapter, on the general nature of 
statistical methods in the quantum mechanics, by saying that the 
proposed methods provide the only possible non-arbitrary procedure 
that we can employ when it becomes necessary to treat systems in 
partially specified quantum mechanical states. 



X

THE MAXWELL-BOLTZMANN, EINSTEIN-BOSE, AND
FERMI-DIRAC DISTRIBUTIONS

86. The microcanonical ensemble as representing a system in equilibrium

We are now ready to commence the application of the methods of 
statistical quantum mechanics which were developed in the preceding 
chapter. As in the classical statistics, we shall usually be interested in 
applying these methods to idealized models for actual physical-chemical 
systems of common occurrence such as gases, liquids, solids, or en- 
closures filled with thermal radiation. These systems can be regarded 
as composed of large numbers of similar subsystems such as atoms, 
molecules, electrons, photons, modes of vibration, or other constituent 
elements. 

In the present chapter we shall study the equilibrium conditions for 
such systems, when the molecules or other elements composing the 
system can be taken as interacting only weakly with each other, so that 
the energy of the system as a whole can be regarded as practically equal 
to the sum of the energies of the individual elements treated as in- 
dependent of one another. As a typical example of such systems we 
may take a dilute gas, the degree of dilution being great enough so that 
the constituent molecules can be regarded for the most part as free 
rather than as engaged in the process of collision. Another example 
arises in studying the thermal properties of a solid under circumstances 
such that its thermal energy can be regarded as distributed among the 
different modes of elastic vibration of which the solid is capable; these 
modes of vibration then play the role of the weakly interacting elements 
out of which the system is taken as constructed. A similar example 
occurs in studying the distribution of radiant energy between the modes 
of electromagnetic vibration in a hollow enclosure. 

The treatment to be given to such situations in the present chapter
will be based on the use of the microcanonical ensemble as providing
an appropriate representative ensemble for a system in a steady
condition of phenomenological equilibrium. The present study will thus
be similar to the analogous classical treatment of equilibrium given in
Chapter IV. Depending, however, on the quantum mechanical symmetry
properties of the elements composing the system, we shall now
be led to three different possible types of distribution of such elements
among their own individual states, rather than merely to the
Maxwell-Boltzmann distribution as in the classical equilibrium of systems
composed of weakly interacting elements.

We must now give detailed consideration to the nature of the systems 
of interest and representative ensembles that will concern us in the 
present chapter. As in the corresponding classical studies of Chapter IV, 
we may regard the system of interest as being enclosed in a massive 
container of volume v, and may assume that the system cannot inter- 
change energy with the walls of the container but can adjust its linear 
and angular momentum by interaction with these (approximately) 
stationary walls. Referring for simplicity to axes at rest with respect 
to the system as a whole, we may then assume that our knowledge of 
the condition of the system is limited to a specification of its energy 
$E_0$, with an uncertainty $\Delta E$ which would be connected with the time
$\Delta t$ available for observation by the Heisenberg uncertainty relation

$$\Delta E\,\Delta t \approx h. \tag{86.1}$$
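
To give a feeling for the magnitudes involved, the relation can be evaluated numerically. This is only an order-of-magnitude sketch: it treats the relation as an equality, uses the SI value of Planck's constant, and the function name is hypothetical.

```python
PLANCK_H = 6.62607015e-34  # Planck's constant in joule seconds (SI defining value)


def energy_uncertainty(delta_t_seconds):
    """Order-of-magnitude energy uncertainty, taking Delta E * Delta t = h as an equality."""
    if delta_t_seconds <= 0:
        raise ValueError("observation time must be positive")
    return PLANCK_H / delta_t_seconds


# An observation lasting one millisecond (an illustrative choice).
delta_e = energy_uncertainty(1.0e-3)
```

For any macroscopic observation time the resulting uncertainty is extremely small on the phenomenological scale, which is what allows a range of energies to be chosen that is large compared with the uncertainty and still negligible for practical purposes.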

As a consequence of this partial specification of the state of the
system of interest we must evidently resort to the methods of statistical
mechanics in order to study the expected properties of the system.
For this purpose, in accordance with the discussions of the preceding
chapter, we may now introduce, as an appropriate representative ensemble
for our system of interest, a microcanonical distribution of
systems (§ 83 (b)) all having approximately the same energy as that
specified for the system itself. As a specific expression for the density
matrix, defining this ensemble in the energy language, we may take

$$\rho_{\mu\nu} = \begin{cases} \rho_0\,\delta_{\mu\nu} & (E_\mu \text{ in range } E \text{ to } E+\delta E), \\ 0 & (E_\mu \text{ not in range } E \text{ to } E+\delta E), \end{cases} \tag{86.2}$$

where it will now be convenient to use Greek letters for the indices that
designate the different energy eigenstates for the system as a whole,
where we choose the range E to $E+\delta E$ so as to include the specified
energy $E_0$, and where we assume conditions such that $\delta E$ can be taken
as large compared with the uncertainty $\Delta E$ in that energy, and yet at
the same time as small enough so that it can be regarded from a
phenomenological point of view as an infinitesimal quantity.

In accordance with the known property of statistical equilibrium for
microcanonical ensembles, we see, in the quantum mechanics as in
the classical mechanics, that a system for which the energy alone is
specified must be considered as exhibiting no preferential tendency for
change in any particular direction. Hence, from the phenomenological
point of view, such a system may be regarded as in a steady state of
equilibrium. Furthermore, in accordance with the character of the
distribution described by (86.2), we see that the microcanonical ensemble
appropriate for such a system would give equal probabilities for
each of the different energy eigenstates $\mu$ for which the corresponding
eigenvalue of energy would lie in the specified range E to $E+\delta E$.
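
A microcanonical density matrix of the kind described by (86.2) can be written down explicitly for a finite list of energy eigenvalues. The sketch below assumes that the matrix is normalized to unit trace, so that $\rho_0$ is the reciprocal of the number of eigenstates in the range, and that the end points of the range are included; both conventions, and the function name, are assumptions made for illustration.

```python
def microcanonical_density_matrix(energies, e_low, delta_e):
    """Diagonal matrix with equal weights rho_0 on states whose energy lies in [e_low, e_low + delta_e]."""
    in_range = [e_low <= e <= e_low + delta_e for e in energies]
    count = sum(in_range)
    if count == 0:
        raise ValueError("no eigenvalue lies in the chosen energy range")
    rho_0 = 1.0 / count  # assumed normalization: trace equal to one
    size = len(energies)
    return [[rho_0 if (i == j and in_range[i]) else 0.0 for j in range(size)]
            for i in range(size)]


# Four hypothetical eigenvalues; only the two middle ones fall in the range 1.0 to 1.1.
rho_micro = microcanonical_density_matrix([0.9, 1.0, 1.05, 1.2], e_low=1.0, delta_e=0.1)
```

The diagonal elements of `rho_micro` are then the equal probabilities assigned to the eigenstates in the range, and every other element vanishes.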

87. Specification of condition for a system composed of weakly interacting elements

We shall next have to undertake a consideration of the different 
conditions, which will be of importance for our statistical considera- 
tions, in the case of a system composed of weakly interacting elements. 
In the present quantum treatment, as in the corresponding classical 
one of Chapter IV, we shall first specify these conditions by stating 
the distribution of the elements composing the system among the 
different elementary states which they can themselves assume, and shall 
then determine the number of states for the system as a whole which 
would correspond to each such specified condition. To understand 
these matters it will be desirable to begin with a general presentation 
of the interrelation between eigensolutions for the system as a whole 
and eigensolutions for its constituent elements. 

(a) Relation of eigensolutions for system to eigensolutions for its
component elements. The different possible energy eigensolutions for the
system as a whole will themselves have to be solutions of the time-free
Schroedinger equation for the system, which was first discussed in § 60.
For our present purposes it will be convenient to write this equation
with the slightly altered notation

$$H\,U_\mu(q) = E_\mu\,U_\mu(q), \tag{87.1}$$

where we now use a capital U with a Greek index to designate an
eigensolution for the system as a whole, and reserve the small u with
Latin indices for later use to designate eigensolutions for a single
molecule or other kind of element out of which the system is composed.
The symbol H in the above equation is the Hamiltonian operator for
the system expressed in coordinate language, and $E_\mu$ is an allowed
eigenvalue for the energy of the system, which will assure the appropriate
character as an eigensolution for the function of the coordinates
$U_\mu(q)$, where the single letter q is to be regarded as symbolizing the
total collection of coordinates for all the molecules or other elements
composing the system.





Let us now consider the circumstance that our present interest lies
in systems composed of weakly interacting molecules or other constituent
elements. If there are n such elements in all, it will then be possible
to express the Hamiltonian operator H, corresponding to the energy
of the whole system, in the form

$$H = H_1 + H_2 + H_3 + \cdots + H_n + V, \tag{87.2}$$

where $H_1, H_2, \ldots, H_n$ are the Hamiltonian operators that correspond separately
to the energies of the n constituent elements, taken as not acting on
each other but as acted on by any general potential field (such as that
of the walls) which may be present, and V is the operator that corresponds
to the remaining energy, arising from the actual presence of
interaction. If this interaction is sufficiently weak, however, we can,
for our present purposes, neglect the term V in comparison with the
others and write

$$H = H_1 + H_2 + \cdots + H_n \tag{87.3}$$

as a suitable approximation. Substituting (87.3) into (87.1), and using
the letters $q_1, q_2, \ldots, q_n$ to symbolize the collection of coordinates for each
of the n separate elements, we then obtain

$$(H_1 + H_2 + \cdots + H_n)\,U_\mu(q_1, q_2, \ldots, q_n) = E_\mu\,U_\mu(q_1, q_2, \ldots, q_n) \tag{87.4}$$

as a new expression for the equation that must be satisfied by the
different eigensolutions $U_\mu(q_1, q_2, \ldots, q_n)$ for the system as a whole. It
will be appreciated as we proceed that we are thus assuming sufficiently
weak interaction so that these unperturbed energy eigenstates can be
taken as substantially the same as the true energy eigenstates that
would be used in constructing a microcanonical ensemble for the
representation of equilibrium.

The above equation may now be compared with the equations which
would determine the eigensolutions, say $u_k(q_1), u_l(q_2), \ldots, u_s(q_n)$, for the
n separate elements which compose the system. These equations will
evidently be of the forms

$$H_1\,u_k(q_1) = E_k\,u_k(q_1), \quad H_2\,u_l(q_2) = E_l\,u_l(q_2), \quad \ldots, \quad H_n\,u_s(q_n) = E_s\,u_s(q_n), \tag{87.5}$$

where $H_1, H_2, \ldots, H_n$ are the Hamiltonian operators for the n separate
elements, expressed respectively in terms of the coordinates $q_1, q_2, \ldots, q_n$
for the elements, and $E_k, E_l, \ldots, E_s$ are possible eigenvalues of the energy
respectively for the n elements. Comparing equation (87.4) with equations
(87.5), it then becomes easy to express any eigenfunction
$U_\mu(q_1, q_2, \ldots, q_n)$ for the system as a whole in terms of eigenfunctions
$u_k(q_1), u_l(q_2), \ldots, u_s(q_n)$ for its separate elements, provided we pay due
attention to the symmetry restrictions that must be imposed on the
eigensolutions for systems containing indistinguishable particles, as
already discussed in § 76 (d).

In the case of a system composed of n distinguishable elements, no
symmetry restrictions will have to be imposed, and it is evident that the
energy eigensolutions for the system will then be of the general form

$$U_\mu(q_1, q_2, \ldots, q_n) = u_k(q_1)\,u_l(q_2) \cdots u_s(q_n), \tag{87.6}$$

where we must regard any change in the specified elements nos. 1, 2, ..., n
assigned to the elementary eigenstates k, l, ..., s, or any change in the
selection of eigenstates, as leading to a new eigensolution for the system
as a whole. The corresponding eigenvalues of energy will be given by

$$E_\mu = E_k + E_l + \cdots + E_s. \tag{87.7}$$
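
The bookkeeping implied by (87.6) and (87.7) can be sketched as an enumeration. The example assumes a small hypothetical spectrum of three single-element energies and two distinguishable elements; each tuple of indices (k, l, ..., s) labels one product eigensolution, and its energy is the sum of the element energies.

```python
from itertools import product


def distinguishable_eigenstates(levels, n):
    """All assignments (k, l, ..., s) of n distinguishable elements to single-element states,
    each paired with the total energy E_k + E_l + ... + E_s."""
    states = []
    for assignment in product(range(len(levels)), repeat=n):
        energy = sum(levels[k] for k in assignment)
        states.append((assignment, energy))
    return states


# Hypothetical single-element spectrum with three levels, and two elements.
product_states = distinguishable_eigenstates([0.0, 1.0, 2.0], n=2)
```

The assignments (0, 1) and (1, 0) both appear in `product_states` with the same energy, reflecting the statement that interchanging which element occupies which elementary state leads to a new eigensolution for the system.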

In the case of a system composed of n indistinguishable particles of
a character requiring symmetric solutions, the energy eigensolutions for
the system will, with suitable normalization, evidently be of the general
form

$$U_\mu(q_1, q_2, \ldots, q_n) = \sum_P P\,u_k(q_1)\,u_l(q_2) \cdots u_s(q_n), \tag{87.8}$$

where we take a sum over all permutations P of the particle indices
1, 2, ..., n, and must regard each different selection of elementary
eigensolutions k, l, ..., s as leading to a different eigensolution for the
system as a whole. The corresponding eigenvalues of energy will again
be expressed by

$$E_\mu = E_k + E_l + \cdots + E_s. \tag{87.9}$$


In the case of a system composed of n indistinguishable particles of
a character requiring antisymmetric solutions, the energy eigensolutions
for the system will, again with suitable normalization, evidently be of
the general form

$$U_\mu(q_1, q_2, \ldots, q_n) = \sum_P \pm P\,u_k(q_1)\,u_l(q_2) \cdots u_s(q_n), \tag{87.10}$$

where we now take a sum over all permutations P of the particle indices
1, 2, ..., n using the negative or positive sign according as the permutation
is odd or even, and must regard each possible selection of elementary
eigensolutions k, l, ..., s as leading to a different eigensolution
$\mu$ for the system as a whole. The corresponding eigenvalues of energy
will once more be expressed by

$$E_\mu = E_k + E_l + \cdots + E_s. \tag{87.11}$$
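
The sums over permutations in (87.8) and (87.10) can be evaluated directly for a small number of particles. The sketch below is unnormalized and uses three hypothetical single-particle functions of one coordinate; it is meant only to exhibit the symmetry properties, not to represent any particular physical system.

```python
from itertools import permutations


def permutation_sign(perm):
    """+1 for an even permutation of 0..n-1 and -1 for an odd one, found by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def permuted_sum(orbitals, coords, antisymmetric):
    """Unnormalized sum over permutations P of (+/-) u_k(q_P1) u_l(q_P2) ... u_s(q_Pn)."""
    total = 0.0
    for perm in permutations(range(len(coords))):
        term = 1.0
        for orbital, p in zip(orbitals, perm):
            term *= orbital(coords[p])
        total += (permutation_sign(perm) if antisymmetric else 1) * term
    return total


# Hypothetical single-particle functions u_k, u_l, u_s of a single coordinate q.
orbitals = [lambda q: 1.0, lambda q: q, lambda q: q * q]
```

Interchanging two coordinates leaves the symmetric sum unchanged and reverses the sign of the antisymmetric one; and if two of the selected single-particle functions coincide, the antisymmetric sum vanishes identically, so that such a selection yields no eigensolution at all.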

We thus obtain three typical examples of the different possible ways
in which the eigensolutions for a system as a whole can depend on the
eigensolutions for its constituent elements. The first of these examples
(87.6) would apply in the case of a system of n distinguishable elements,
such as spatially located and oriented modes of oscillation, and will lead
to a type of statistical result which can be appropriately designated as
Maxwell-Boltzmann. The second example (87.8) applies in the case of
a system of n indistinguishable particles subject to symmetric restrictions,
such as nuclei and atoms composed of an even number of fundamental
particles, or photons treated as particles, and will lead to the so-called
Einstein-Bose type of statistical result. The third example (87.10)
applies in the case of a system of n indistinguishable particles subject to
antisymmetric restrictions, such as the fundamental material particles,
electrons, protons, and neutrons, or nuclei and atoms composed of an
odd number of such particles, and will lead to the so-called Fermi-Dirac
type of statistical result.

More complicated cases of systems composed of more than a single 
type of element can evidently be handled by simple extensions of the 
above treatment. 
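
The three kinds of dependence can be compared by simply counting the eigensolutions they allow. The following sketch assumes n elements distributed over g single-element eigenstates (both numbers hypothetical), and uses the facts that an ordered assignment defines a product solution, an unordered selection with repetition defines a symmetric solution, and an unordered selection without repetition defines an antisymmetric solution.

```python
from itertools import combinations, combinations_with_replacement, product
from math import comb


def count_system_states(g, n):
    """Numbers of system eigensolutions for n elements over g single-element states:
    (distinguishable, symmetric, antisymmetric)."""
    distinguishable = sum(1 for _ in product(range(g), repeat=n))
    symmetric = sum(1 for _ in combinations_with_replacement(range(g), n))
    antisymmetric = sum(1 for _ in combinations(range(g), n))
    return distinguishable, symmetric, antisymmetric


def closed_forms(g, n):
    """The same three numbers from the usual combinatorial expressions."""
    return g ** n, comb(g + n - 1, n), comb(g, n)
```

For instance, `count_system_states(3, 2)` gives nine, six, and three eigensolutions respectively, in agreement with `closed_forms(3, 2)`.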

(b) Method of specifying different conditions. We may now give
detailed attention to the method of specifying the conditions of such
systems which will be of importance for our statistical considerations.
These specifications will depend on the nature of the energy spectra
for the individual elements composing the system.

We shall assume, for the time being for simplicity, that our system
of interest will be composed of n constituent elements of only a single
kind, and shall take such elements as having a known spectrum of energy
eigenvalues, the same for each individual element without reference to
the possibility or impossibility of maintaining a distinction between
them. Thus our n elements could be n oscillators, all having the same
energy spectrum because of the same intrinsic frequency $\nu$, but
distinguishable from each other by spatial location or orientation. Or the
n elements could be n particles, all having the same energy spectrum
because of the same mass m and spin s, but not permanently
distinguishable from each other on account of their free motion inside a
common container.




To obtain an appropriate approximate description of the energy
spectrum for a single constituent element in the potential field applying
to the system as a whole, we shall regard the total possible range in
energy $\epsilon$ as divided up into a succession of small ranges $\epsilon$ to $\epsilon+\Delta\epsilon$,
each range being identified by an index k and being of a width to
correspond to the approximate accuracy with which we shall later wish
to specify different conditions of the system. We shall then describe
the spectrum by stating the number of energy eigensolutions $g_k$ which
fall in each such range $\Delta\epsilon_k$. In most cases the energy spectrum will be
nearly enough continuous so that this description can be regarded as
given by an expression of the form

$$g_k = f(\epsilon_k)\,\Delta\epsilon_k, \tag{87.12}$$

where $f(\epsilon_k)$ is a continuous function of the energy that locates the
range. As an example of such expressions we have our previous finding
for particles as given by (71.16).

With the help of such a description of the energy spectrum for a
single constituent element we shall then specify the different conditions
of the system as a whole that concern us by stating the number of
constituent elements $n_\kappa$ which lie in each of the groups of
$g_\kappa$ eigenstates that correspond to our succession of ranges in
energy $\Delta\epsilon_\kappa$.
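A description of the form (87.12) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes a spinless particle in a cubical box, measures energy in units of $h^2/8mL^2$ (L being the edge of the box, a symbol not used in the text) so that the dimensionless energy is $n_x^2+n_y^2+n_z^2$, counts the eigenstates falling in one range, and compares the count with the smooth estimate $f(e) = (\pi/4)\,e^{1/2}$ integrated over that range. The function names are hypothetical.

```python
from math import isqrt, pi

def count_states(lo, hi):
    """Count triples of positive integers with lo <= nx^2 + ny^2 + nz^2 < hi."""
    def n_upto(t):
        # Number of integers nz >= 1 with nz*nz <= t (zero when t < 1).
        return isqrt(t) if t >= 1 else 0

    nmax = isqrt(hi - 1)
    total = 0
    for nx in range(1, nmax + 1):
        for ny in range(1, nmax + 1):
            s = nx * nx + ny * ny
            # nz values with s + nz^2 < hi, minus those with s + nz^2 < lo.
            total += n_upto(hi - 1 - s) - n_upto(lo - 1 - s)
    return total

def smooth_estimate(lo, hi):
    """Integral of f(e) = (pi/4) * sqrt(e) from lo to hi."""
    return (pi / 6) * (hi ** 1.5 - lo ** 1.5)

# One energy range of the kind indexed by kappa in the text (illustrative limits).
g_exact = count_states(8000, 10000)
g_smooth = smooth_estimate(8000, 10000)
print(g_exact, round(g_smooth))
```

The two numbers agree to within a few per cent for a range this high in the spectrum; the residual difference comes from states near the coordinate planes, which the continuous estimate does not treat separately.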

It will be noted, in the quantum treatment as in the analogous 
classical treatment, that our specifications of condition will not in 
general be sufficient to specify a precise state of the system, this being, of 
course, a characteristic feature of statistical mechanical considerations. 
It will also be noted that our method of specifying different conditions
does not itself have to make any direct reference to the three different 
types of relation which may actually be found for the dependence of 
the eigensolutions for the system on the eigensolutions for its elements. 

(c) Number of eigenstates corresponding to a specified condition of the
system. We are now finally ready to consider the number of eigenstates
for the system as a whole that would correspond to any particular
condition of the system as specified by the method described above.
The form of expression, describing the dependence of the number of
eigenstates G for the system on the number of elements $n_\kappa$ in each
group of $g_\kappa$ elementary eigenstates, will be different for different
types of relation between eigensolutions for the system and eigensolutions
for its elements. We shall hence give separate treatment to the three
types of such relation described in the first part of this section.

In the Maxwell-Boltzmann case, arising when the system is composed
of n distinguishable elements, it is evident, from the relation between
the two kinds of eigenstates given by (87.6), that each particular
assignment of the elements to their possible elementary eigenstates
would correspond to a different eigenstate for the system as a whole.
To determine the consequences of this we note that our total collection
of n elements could evidently be divided into quotas containing
$n_\kappa$ members in

$$\frac{n!}{\prod_\kappa n_\kappa!}$$

different ways, and that the elements in any such quota could be
assigned to $g_\kappa$ different eigenstates in $g_\kappa^{n_\kappa}$
different ways. Hence we can at once take

$$G = n!\,\prod_\kappa \frac{g_\kappa^{n_\kappa}}{n_\kappa!} \tag{87.13}$$

in the Maxwell-Boltzmann case, as the desired expression for the number
of eigenstates G that would correspond to a condition of the system
specified by the $n_\kappa$ in the manner that we have described.

In the Einstein-Bose case, arising when the system is composed of
n indistinguishable particles and the eigensolutions must be symmetric,
it is evident, from the relation between the two kinds of eigenstates
given by (87.8), that each possible way of selecting occupied states for
our n particles would correspond to a different eigenstate for the system
as a whole, and that there is no restriction on taking a given eigenstate
as occupied by more than a single particle. To determine the
consequences of this, let us fix our attention on the $n_\kappa$ particles
assigned to a specified group of $g_\kappa$ states, and let us consider the
possibilities for making permutations in a linear array of
$n_\kappa+g_\kappa-1$ objects, which can be regarded as the $n_\kappa$
particles together with the $g_\kappa-1$ partitions which would be
sufficient to separate the length available for the array into $g_\kappa$
cells. Since the total number of permutations for the
$n_\kappa+g_\kappa-1$ objects would be equal to $(n_\kappa+g_\kappa-1)!$,
and since the permutations of the particles among themselves, or of the
partitions among themselves, would not be significant, we should then
have a total of

$$\frac{(n_\kappa+g_\kappa-1)!}{n_\kappa!\,(g_\kappa-1)!}$$

different ways of selecting occupied states from the group under
consideration. Hence we can now evidently take

$$G = \prod_\kappa \frac{(n_\kappa+g_\kappa-1)!}{n_\kappa!\,(g_\kappa-1)!} \tag{87.14}$$

in the Einstein-Bose case, as the desired expression for the number of
eigenstates G that would correspond to a condition of the system
specified by the $n_\kappa$.

Finally, in the Fermi-Dirac case, arising when the system is composed
of n indistinguishable particles and the eigensolutions must be
antisymmetric, it is evident, from the relation between the two kinds
of eigenstates given by (87.10), that each possible way of selecting
occupied states for our n particles would correspond to a different
eigenstate for the system as a whole, but that it would now be
impossible to have more than a single particle in any individual state,
since, in agreement with our previous discussion of the Pauli exclusion
principle, we see from the form of (87.10) that this would make the
corresponding eigensolution equal to zero. As a consequence it is evident
that the number of particles $n_\kappa$ assigned to any group of elementary
eigenstates cannot now be greater than the number of states $g_\kappa$ in
the group. If the particles were distinguishable from one another they
could evidently be assigned in

$$\frac{g_\kappa!}{(g_\kappa-n_\kappa)!}$$

different ways without putting more than a single particle into a given
elementary state. Hence, making due allowance for their actual
indistinguishability, we can now take

$$G = \prod_\kappa \frac{g_\kappa!}{n_\kappa!\,(g_\kappa-n_\kappa)!} \tag{87.15}$$

in the Fermi-Dirac case, as the desired expression for the number of
eigenstates G that would correspond to a condition of the system
specified by the $n_\kappa$.

This must now complete our necessarily somewhat complicated dis- 
cussion of the specification of condition for quantum mechanical sys- 
tems composed of weakly interacting elements. 
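The three counting rules of this section can be checked by direct enumeration for a very small system. The following sketch is an illustration only: the group sizes `g`, the required occupation `n_target`, and all function names are arbitrary choices made here, not quantities from the text. It enumerates every assignment explicitly and compares the counts with the closed forms (87.13), (87.14), and (87.15).

```python
from itertools import product, combinations, combinations_with_replacement
from math import factorial, comb, prod

g = [2, 3]          # number of elementary states in each group (illustrative)
n_target = (2, 1)   # number of elements required in each group
n = sum(n_target)

# Group index of every elementary state, listed in one flat sequence.
group = [k for k, gk in enumerate(g) for _ in range(gk)]
all_states = range(len(group))

def occupation(states):
    """Numbers of elements found in each group for the given occupied states."""
    occ = [0] * len(g)
    for s in states:
        occ[group[s]] += 1
    return tuple(occ)

# Maxwell-Boltzmann: every ordered assignment of distinguishable elements counts.
G_mb = sum(occupation(a) == n_target for a in product(all_states, repeat=n))
# Einstein-Bose: only the multiset of occupied states matters.
G_eb = sum(occupation(a) == n_target
           for a in combinations_with_replacement(all_states, n))
# Fermi-Dirac: no state may be occupied more than once.
G_fd = sum(occupation(a) == n_target for a in combinations(all_states, n))

# Closed forms (87.13), (87.14), (87.15).
F_mb = (factorial(n) * prod(gk ** nk for gk, nk in zip(g, n_target))
        // prod(factorial(nk) for nk in n_target))
F_eb = prod(comb(nk + gk - 1, nk) for gk, nk in zip(g, n_target))
F_fd = prod(comb(gk, nk) for gk, nk in zip(g, n_target))

print(G_mb, G_eb, G_fd)   # prints: 36 9 3
```

For these values the enumeration gives 36, 9, and 3 eigenstates in the three cases, in agreement with the formulas.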


88. The probabilities for different conditions of the system 
We may now combine the results of the two preceding sections. In
the first of these sections, §86, it has been shown that a quantum
mechanical system, in a steady state of phenomenological equilibrium
with a specified energy, can be appropriately represented by a
microcanonical ensemble, and that such an ensemble gives equal
probabilities for each of the eigenstates k for which the eigenvalue of
energy $E_k$ would fall in a narrow range E to $E+\delta E$ selected as
including the energy specified for the system. In the second of these
sections, §87, we have then obtained, for the three possible types of
systems composed of n weakly interacting elements, the expressions
(87.13), (87.14), and (87.15) for the number G of such eigenstates k that
would correspond to a condition of the system specified by the numbers of
elements $n_\kappa$ assigned to the different groups of elementary
eigenstates $g_\kappa$ that interest us. Combining these results, we may
now write the following expressions for the probabilities P of finding
our system in a condition specified by the $n_\kappa$: in the
Maxwell-Boltzmann case

$$P = \text{const.}\; n!\,\prod_\kappa \frac{g_\kappa^{n_\kappa}}{n_\kappa!}; \tag{88.1}$$

in the Einstein-Bose case

$$P = \text{const.}\; \prod_\kappa \frac{(n_\kappa+g_\kappa-1)!}{n_\kappa!\,(g_\kappa-1)!}; \tag{88.2}$$

and in the Fermi-Dirac case

$$P = \text{const.}\; \prod_\kappa \frac{g_\kappa!}{n_\kappa!\,(g_\kappa-n_\kappa)!}; \tag{88.3}$$

where the factor denoted by 'const.' would in every case have a value
independent of the particular assignment considered. The expressions
apply, of course, to conditions such that the energy of the system
would lie in the selected range E to $E+\delta E$, the probabilities for
other conditions being zero.

For our further purposes it will be more convenient to take the
logarithms of these probabilities. Assuming for the time being that
the numbers $n_\kappa$, $g_\kappa$, and also $g_\kappa-n_\kappa$ in the
Fermi-Dirac case, are all large compared with unity, we then obtain, with
the help of the Stirling approximation for factorials (28.3), the
approximate results, in the Maxwell-Boltzmann case

$$\log P = n\log n + \sum_\kappa \{n_\kappa \log g_\kappa - n_\kappa \log n_\kappa\} + \text{const.}; \tag{88.4}$$

in the Einstein-Bose case

$$\log P = \sum_\kappa \{(n_\kappa+g_\kappa)\log(n_\kappa+g_\kappa) - n_\kappa \log n_\kappa - g_\kappa \log g_\kappa\} + \text{const.}; \tag{88.5}$$

and in the Fermi-Dirac case

$$\log P = \sum_\kappa \{(n_\kappa-g_\kappa)\log(g_\kappa-n_\kappa) - n_\kappa \log n_\kappa + g_\kappa \log g_\kappa\} + \text{const.} \tag{88.6}$$

It will be noted, except for the term $n\log n$ in (88.4) which can be
included in the constant term when it is not necessary to consider
variations in the number of elements n, that the three expressions
approach identical form with $g_\kappa \gg n_\kappa$, that is, with many
states available in the different energy ranges for the elements assigned
to them. As we shall see later, this will be true under conditions of
high temperature and great dilution where the three types of quantum
mechanical statistical results come into correspondence principle
agreement with the classical Maxwell-Boltzmann results.
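Both statements of this section can be tested numerically for a single group of states. The sketch below is illustrative only and uses hypothetical function names and arbitrarily chosen values of $n_\kappa$ and $g_\kappa$: it evaluates the exact logarithm of one factor of (88.1), (88.2), and (88.3) by means of the log-gamma function, shows that the three values differ appreciably when $g_\kappa$ is comparable with $n_\kappa$ but nearly coincide when $g_\kappa \gg n_\kappa$, and compares one summand of the Stirling form (88.6) with its exact counterpart.

```python
from math import lgamma, log

def log_factorial(x):
    """log(x!) evaluated through the log-gamma function."""
    return lgamma(x + 1.0)

def exact_terms(n, g):
    """Exact logarithms of one group's factor in (88.1), (88.2), (88.3).

    The overall n! of (88.1) and the constant are omitted.
    """
    mb = n * log(g) - log_factorial(n)
    eb = log_factorial(n + g - 1) - log_factorial(n) - log_factorial(g - 1)
    fd = log_factorial(g) - log_factorial(n) - log_factorial(g - n)
    return mb, eb, fd

def stirling_fd(n, g):
    """One summand of (88.6), the Stirling form for the Fermi-Dirac case."""
    return (n - g) * log(g - n) - n * log(n) + g * log(g)

# Few states per element, then very many states per element.
for n, g in [(1000, 2000), (1000, 10**7)]:
    mb, eb, fd = exact_terms(n, g)
    print(n, g, round(mb, 1), round(eb, 1), round(fd, 1))

# Exact value against the Stirling form for one Fermi-Dirac group.
print(exact_terms(1000, 3000)[2], stirling_fd(1000, 3000))
```

With the first pair of values the Einstein-Bose term is the largest and the Fermi-Dirac term the smallest; with the second pair the three are practically equal.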


89. Condition of maximum probability. The three distribution 
laws 

With the help of the foregoing expressions for the probabilities of
different conditions we can now determine the most probable condition,
for a system of the kind that we are considering, by examining the
effect of varying the assignment of elements to the different groups
of $g_\kappa$ states, keeping the total number of elements n constant
(except in the special case of photons), and keeping the total energy E
constant since all our conditions must correspond to an energy restricted
to a narrow range E to $E+\delta E$. Making use of (88.4-6), this is seen
to lead to the following variational equations for determining the
desired conditional maximum of log P and hence of P itself, where the
three equations in the first group apply respectively to the
Maxwell-Boltzmann, Einstein-Bose, and Fermi-Dirac cases:

$$
-\delta \log P = 0 =
\begin{cases}
\sum_\kappa \{\log n_\kappa - \log g_\kappa\}\,\delta n_\kappa, \\
\sum_\kappa \{\log n_\kappa - \log(n_\kappa+g_\kappa)\}\,\delta n_\kappa, \\
\sum_\kappa \{\log n_\kappa - \log(g_\kappa-n_\kappa)\}\,\delta n_\kappa,
\end{cases} \tag{89.1}
$$

$$\delta n = \sum_\kappa \delta n_\kappa = 0, \tag{89.2}$$

$$\delta E = \sum_\kappa \epsilon_\kappa\,\delta n_\kappa = 0. \tag{89.3}$$

In setting up the last of these equations we assume the ordinary case
of a nearly continuous energy spectrum for the elements composing the
system. This makes it possible to have a large number of states
$g_\kappa$, as assumed above, in an energy range $\epsilon_\kappa$ to
$\epsilon_\kappa+\Delta\epsilon_\kappa$ which is narrow enough to be
treated as differential. Combining the above equations for the
conditional maximum of log P, using the Lagrange method of undetermined
multipliers, we then obtain for the three cases respectively

$$
\begin{aligned}
\sum_\kappa \{\log n_\kappa - \log g_\kappa + \alpha + \beta\epsilon_\kappa\}\,\delta n_\kappa &= 0, \\
\sum_\kappa \{\log n_\kappa - \log(n_\kappa+g_\kappa) + \alpha + \beta\epsilon_\kappa\}\,\delta n_\kappa &= 0, \\
\sum_\kappa \{\log n_\kappa - \log(g_\kappa-n_\kappa) + \alpha + \beta\epsilon_\kappa\}\,\delta n_\kappa &= 0,
\end{aligned} \tag{89.4}
$$

where $\alpha$ and $\beta$ are the undetermined constants.

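Since the variations $\delta n_\kappa$ can now be treated as arbitrary,
these equations can only be satisfied when the coefficients of the
$\delta n_\kappa$ are individually equal to zero, and solving for the
$n_\kappa$ this then leads for the three cases respectively to the desired
expressions; in the Maxwell-Boltzmann case

$$n_\kappa = \frac{g_\kappa}{e^{\alpha+\beta\epsilon_\kappa}}; \tag{89.5}$$

in the Einstein-Bose case

$$n_\kappa = \frac{g_\kappa}{e^{\alpha+\beta\epsilon_\kappa}-1}; \tag{89.6}$$

and in the Fermi-Dirac case

$$n_\kappa = \frac{g_\kappa}{e^{\alpha+\beta\epsilon_\kappa}+1}. \tag{89.7}$$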

Depending on the type of quantum mechanical system considered,
we thus obtain the three different distribution laws which describe the
equilibrium distribution of the constituent elements of a system over
their own individual states. When convenient, the three distribution
laws may be expressed in the combined form

$$n_\kappa = \frac{g_\kappa}{e^{\alpha+\beta\epsilon_\kappa} \pm 1\,(0)} \tag{89.8}$$

where we take 0, -1, or +1 according as we are interested in the
Maxwell-Boltzmann, Einstein-Bose, or Fermi-Dirac case.
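That the distribution laws do correspond to a conditional maximum of log P can be checked numerically. The sketch below is an illustration under assumptions chosen here, not taken from the text: three groups of states with equally spaced energies, arbitrary values of $\alpha$ and $\beta$, the Stirling forms (88.4) to (88.6) extended to non-integral $n_\kappa$, and hypothetical function names. The displacement (1, -2, 1) leaves both the total number and the total energy unchanged, so that log P should decrease for a displacement in either direction.

```python
from math import exp, log

def law(g, eps, alpha, beta, s):
    """Combined distribution law (89.8); s = 0 (M-B), -1 (E-B), +1 (F-D)."""
    return g / (exp(alpha + beta * eps) + s)

def log_p(ns, gs, s):
    """Stirling form of log P, (88.4)-(88.6), up to an additive constant."""
    total = 0.0
    for n, g in zip(ns, gs):
        if s == 0:
            total += n * log(g) - n * log(n)
        elif s == -1:
            total += (n + g) * log(n + g) - n * log(n) - g * log(g)
        else:
            total += (n - g) * log(g - n) - n * log(n) + g * log(g)
    return total

gs = [1000.0, 1000.0, 1000.0]   # states per group (illustrative)
eps = [0.0, 1.0, 2.0]           # equally spaced energies (illustrative)
alpha, beta = 1.0, 0.5          # arbitrary positive multipliers
shift = [1.0, -2.0, 1.0]        # conserves sum(n) and sum(eps * n)

results = {}
for s in (0, -1, +1):
    n0 = [law(g, e, alpha, beta, s) for g, e in zip(gs, eps)]
    base = log_p(n0, gs, s)
    up = log_p([n + d for n, d in zip(n0, shift)], gs, s)
    down = log_p([n - d for n, d in zip(n0, shift)], gs, s)
    # Both differences should be positive if n0 is the conditional maximum.
    results[s] = (base - up, base - down)
    print(s, results[s])
```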

The derivations which we have given for these relations have been
obtained with the help of the Stirling approximation for factorials, and
this may be regarded to some extent as a defect, especially in the
Fermi-Dirac case where we have had to assume $g_\kappa-n_\kappa$ large
compared with unity and yet shall actually wish to make applications to
cases (conduction electrons) where the number of states and number of
particles would be nearly equal. Nevertheless, we shall be able to
show later, in §§113 and 114, that these approximate expressions for
the most probable numbers of elements in different states in a
microcanonical ensemble can be taken as the exact expressions for the mean
numbers of elements in different states in a canonical or in a grand
canonical ensemble.

In developing the derivations we have assumed, in accordance with 
(89.2), that the total number of elements n was to be kept constant 
in carrying out the variations which determine the most probable con- 
dition. In our later application of the Einstein-Bose result to photons 
treated as particles (see § 93 (a)) we shall wish to remove this restriction, 
since there is no conservation law for the number of photons in a 
system. It will be seen from the method by which equations (89.4) 
have been obtained that this can be accomplished by setting the un- 
determined constant a equal to zero in the final result (89.6). 

It will be noted, in agreement with the remarks made at the end of
§ 88, that all three distribution laws approach the same form when
$g_\kappa \gg n_\kappa$, since $e^{\alpha+\beta\epsilon_\kappa}$ would then
be large compared with unity.


90. Distribution in systems containing constituent elements of 
more than one kind 


For simplicity the foregoing treatments of the Maxwell-Boltzmann, 
Einstein-Bose, and Fermi-Dirac distributions were carried out assuming 
systems composed of only a single kind of constituent element. It is 
immediately evident, however, that similar methods could be applied
to systems composed of large numbers of different kinds of weakly 
interacting elements, say n of one kind, m of another kind, and so forth. 

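In such a case any condition of the system of interest could be
specified by giving the numbers of elements $n_\kappa$, $m_\lambda$, etc.,
that fall respectively in the different groups of elementary eigenstates
$g_\kappa$, $g_\lambda$, etc., that could be set up for each kind of
constituent element. The probability P for any such condition would now
evidently be given by a product

$$P = P_1 P_2 \cdots \tag{90.1}$$

where the factors $P_1$, $P_2$, etc., would be respectively associated
with the elements of the different kinds n, m, etc., in number, and would
have one or another of the forms (88.1-3), according to the particular
symmetry character for the kind of element involved. And the condition
of maximum probability would now be determined by a system of
simultaneous variational equations of the form

$$\delta \log P = \sum_\kappa \frac{\partial \log P_1}{\partial n_\kappa}\,\delta n_\kappa + \sum_\lambda \frac{\partial \log P_2}{\partial m_\lambda}\,\delta m_\lambda + \cdots = 0, \tag{90.2}$$

$$\delta n = \sum_\kappa \delta n_\kappa = 0, \qquad \delta m = \sum_\lambda \delta m_\lambda = 0, \qquad \ldots \tag{90.3}$$

$$\delta E = \sum_\kappa \epsilon_\kappa\,\delta n_\kappa + \sum_\lambda \epsilon_\lambda\,\delta m_\lambda + \cdots = 0. \tag{90.4}$$

Combining these equations by the method of undetermined multipliers,
we should then obtain an expression of the form

$$\sum_\kappa \Big\{-\frac{\partial \log P_1}{\partial n_\kappa} + \alpha_n + \beta\epsilon_\kappa\Big\}\,\delta n_\kappa + \sum_\lambda \Big\{-\frac{\partial \log P_2}{\partial m_\lambda} + \alpha_m + \beta\epsilon_\lambda\Big\}\,\delta m_\lambda + \cdots = 0, \tag{90.5}$$

where $\alpha_n$, $\alpha_m$, etc., and $\beta$ would now be undetermined
multipliers. Since the variations $\delta n_\kappa$, $\delta m_\lambda$,
etc., could then be treated as independent, this would necessitate the
value zero for the coefficients of $\delta n_\kappa$, $\delta m_\lambda$,
etc., in the above equation.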

Hence, comparing (90.5) with the earlier equations (89.4) appearing
at the analogous stage of our previous treatment of equilibrium in
systems containing only a single kind of constituent element, it is now
seen that we shall be led in the general case for each kind of constituent
element to a distribution law of the general form

$$n_\kappa = \frac{g_\kappa}{e^{\alpha+\beta\epsilon_\kappa} \pm 1\,(0)}, \tag{90.6}$$

where we take the Maxwell-Boltzmann, Einstein-Bose, or Fermi-Dirac
specific form, depending on the character of the elements considered.
It will also be noted, as can be concluded from an examination of the
steps leading to (90.5), that the constant $\beta$ in expressions of the
form (90.6) would have the same value for all the different kinds of
constituent element out of which our total system is constructed. This
character of the constant $\beta$, as a parameter applying at equilibrium
to all constituents of the system, is of importance in connexion with the
relation of $\beta$ to the temperature T of the system, which will be
obtained in the next section.


91. Evaluation of constants in the distribution laws 
We must now consider the values of the two constants $\alpha$ and
$\beta$, which were introduced as arbitrary multipliers in the course of
the foregoing derivations, and which still appear in the final expressions
for the three forms of distribution law

$$n_\kappa = \frac{g_\kappa}{e^{\alpha+\beta\epsilon_\kappa} \pm 1\,(0)}. \tag{91.1}$$

These constants are to be regarded as adjustable parameters, whose
values depend on the number and kind of elements composing the
system, and on the equilibrium condition considered.

(a) Value of constant $\alpha$. A formal expression for evaluating the
constant $\alpha$, for any assigned value of $\beta$, can be immediately
obtained by summing (91.1) over all the different groups of $g_\kappa$
elementary eigenstates which have been set up. In this way we must obtain
the total number of elements n in the system, and hence can write the
equality

$$n = \sum_\kappa \frac{g_\kappa}{e^{\alpha+\beta\epsilon_\kappa} \pm 1\,(0)}. \tag{91.2}$$

In principle this equation could then be solved for $\alpha$, assuming
$\beta$ known, and assuming the information as to the character of the
energy spectrum of the elements needed for the determination of the
$g_\kappa$ and $\epsilon_\kappa$.

In the Maxwell-Boltzmann case, or in the other two cases under
conditions such that the added term ±1 in the denominator becomes
negligible, we can at once obtain from equation (91.2) the explicit
solution

$$e^{\alpha} = \frac{1}{n}\sum_\kappa g_\kappa\,e^{-\beta\epsilon_\kappa}, \tag{91.3}$$

which is the quantum analogue of our previous classical equation (32.3)
for evaluating the equivalent constant. In general, however, in the
Einstein-Bose and Fermi-Dirac cases, no simple explicit solutions for
the constant $\alpha$ are available, and special methods of treatment have
to be devised as will be discussed later.
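Although no simple explicit solution exists in the Einstein-Bose and Fermi-Dirac cases, equation (91.2) can always be solved for $\alpha$ numerically once the $g_\kappa$, $\epsilon_\kappa$, and $\beta$ are given. The following sketch is one possible way of doing so and is not a method described in the text: it uses simple bisection, relying on the fact that the right-hand side of (91.2) decreases steadily as $\alpha$ increases. The spectrum, the value of $\beta$, the number n, the bracketing limits, and the function names are all illustrative assumptions.

```python
from math import exp, log

def total_number(alpha, beta, gs, eps, s):
    """Right-hand side of (91.2); s = 0, -1, +1 for M-B, E-B, F-D."""
    return sum(g / (exp(alpha + beta * e) + s) for g, e in zip(gs, eps))

def alpha_mb(n, beta, gs, eps):
    """Explicit Maxwell-Boltzmann solution (91.3)."""
    return log(sum(g * exp(-beta * e) for g, e in zip(gs, eps)) / n)

def alpha_bisect(n, beta, gs, eps, s, lo, hi, iters=200):
    """Solve (91.2) for alpha by bisection between lo and hi.

    Assumes total_number exceeds n at lo and falls below n at hi.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if total_number(mid, beta, gs, eps, s) > n:
            lo = mid      # too many elements: alpha must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative spectrum: 50 groups of 100 states, energies 0.0, 0.1, ..., 4.9.
gs = [100.0] * 50
eps = [0.1 * k for k in range(50)]
beta, n = 1.0, 500.0

a_mb = alpha_mb(n, beta, gs, eps)
# With the lowest energy equal to zero, alpha must stay positive in the E-B case.
a_eb = alpha_bisect(n, beta, gs, eps, -1, 1e-9, 50.0)
a_fd = alpha_bisect(n, beta, gs, eps, +1, -50.0, 50.0)
print(a_fd, a_mb, a_eb)
```

For the same n and $\beta$ the Fermi-Dirac value of $\alpha$ comes out smallest and the Einstein-Bose value largest, with the Maxwell-Boltzmann value from (91.3) lying between them.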

In any case it will be noted that $\alpha$, as was true for the equivalent
classical constant, depends not only on the structure of the system but
also on the particular value assigned to $\beta$, and hence as we shall
see on the temperature of the system.

(b) Value of constant $\beta$. In order to proceed to an understanding of
the other constant $\beta$, it will now be profitable, as in the analogous
classical development of § 32 (b), to take a portion of our system as
consisting of a dilute monatomic gas, composed of n particles, of mass
m, and let us say for specificity particles not having spin, all enclosed
in a container of volume v. Our system is thus provided with an
adjunct, which can be made to serve the functions of a classical perfect
gas thermometer, and to lead to an interpretation of $\beta$.

Making use of the result (71.16) already calculated for the number
of eigensolutions for a particle in a container, we may write

$$g_\kappa = \frac{2\pi v\,(2m)^{3/2}}{h^3}\,\epsilon_\kappa^{1/2}\,\Delta\epsilon_\kappa \tag{91.4}$$

as an expression for the number of elementary eigenstates that would
be available to the particles, composing our gas, in any small energy
range $\Delta\epsilon_\kappa$. Substituting this into the distribution law
(91.1), and changing for convenience to a differential form of expression,
we can then write

$$dn = \frac{2\pi v\,(2m)^{3/2}}{h^3}\,\frac{\epsilon^{1/2}\,d\epsilon}{e^{\alpha+\beta\epsilon} \pm 1} \tag{91.5}$$

for the number of particles in our thermometer which would have
energies in the range $d\epsilon$, where we are to take the positive or
negative sign in the denominator depending on the symmetry character of
the particles. Integrating over all possible values of the energy, we also
obtain an expression of the form

$$n = \frac{2\pi v\,(2m)^{3/2}}{h^3}\int_0^\infty \frac{\epsilon^{1/2}\,d\epsilon}{e^{\alpha+\beta\epsilon} \pm 1} \tag{91.6}$$

for the total number of particles of the gas.

We are now in a position to understand the possibility of using such
a container full of gas as a classical perfect gas thermometer providing
an evaluation of $\beta$. For this purpose, with any given number of
particles n, we have only to choose the volume of the container v and the
mass of the particles m large enough. This will have the following
consequences. In the first place, by taking the mass of the particles m
sufficiently large, it will be possible, in accordance with the
correspondence principle, to regard the particles as obeying the classical
mechanics. In the second place, by taking the volume v sufficiently
large, it will then be possible to regard the gas as perfect and hence
as satisfying the consequences derived by the classical statistical
mechanics for perfect gases. In the third place, by taking v and m
sufficiently large, it will be seen from the form of (91.6), for any given
number of particles n, that the parameter $e^{\alpha}$ can be made so great
as to be very large compared with unity; this will then permit us to
rewrite the distribution law (91.5) in the form

$$dn = \frac{2\pi v\,(2m)^{3/2}}{h^3}\,e^{-\alpha-\beta\epsilon}\,\epsilon^{1/2}\,d\epsilon = \text{const.}\;e^{-\beta\epsilon}\,\epsilon^{1/2}\,d\epsilon, \tag{91.7}$$

which now agrees with our previous classical expression (31.3) applied
to the distribution of the particles of a perfect gas. In the classical
case, however, we have already shown in § 32 (b) that the parameter
$\beta$ in such a distribution expression would be connected with the
temperature T of the perfect gas by the relation $\beta = 1/kT$, where k is
Boltzmann's constant. Hence, since the quantity $\beta$ introduced in the
quantum treatment was shown in § 90 to have the same value for all
constituents of the system, and has now been shown equal to the
classical $\beta$ for a constituent of the system so chosen as to obey the
classical statistics, we can now write, also in the quantum statistics,

$$\beta = \frac{1}{kT} \tag{91.8}$$

as a connexion between the statistical mechanical parameter $\beta$ and the
phenomenological temperature of the system T as measured on a perfect
gas thermometer.
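The passage from (91.5) to (91.7) rests on the statement that the term ±1 becomes negligible when $e^{\alpha}$ is large. This can be illustrated numerically in terms of the dimensionless variable $x = \beta\epsilon$, for which the integral in (91.6) becomes the integral of $x^{1/2}/(e^{\alpha+x} \pm 1)$. The sketch below is an illustration only: the midpoint rule, the substitution $x = t^2$, the cut-off, and the function names are choices made here. It compares the integral with its classical value $e^{-\alpha}\,\Gamma(3/2)$ for a large and for a small value of $\alpha$.

```python
from math import exp, pi, sqrt

def reduced_integral(alpha, sign, t_max=8.0, steps=4000):
    """Midpoint estimate of the integral of x**0.5 / (exp(alpha + x) + sign), x >= 0.

    The substitution x = t*t removes the square-root behaviour at the origin;
    the integrand is negligible beyond t_max for the values of alpha used here.
    """
    h = t_max / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += 2.0 * t * t / (exp(alpha + t * t) + sign)
    return total * h

def classical_value(alpha):
    """The same integral with the +1 or -1 dropped: exp(-alpha) * Gamma(3/2)."""
    return exp(-alpha) * sqrt(pi) / 2.0

for alpha in (10.0, 0.1):
    fd = reduced_integral(alpha, +1) / classical_value(alpha)
    eb = reduced_integral(alpha, -1) / classical_value(alpha)
    print(alpha, fd, eb)
```

For $\alpha = 10$ both ratios are practically unity, so that the classical form (91.7) applies to either kind of particle; for $\alpha = 0.1$ the Fermi-Dirac ratio falls well below unity and the Einstein-Bose ratio rises well above it.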

As in the classical statistics, we may regard this interpretation of
$\beta$ in terms of T as providing increased insight into the character of
the statistical results that we obtain. It may also again be remarked
that we shall later present in §§ 122 and 131 what may seem to be a
more fundamental method of introducing the thermodynamic notion
of temperature into statistical mechanical considerations.

92. Maxwell-Boltzmann systems

Although it is the primary purpose of this book to consider the 
principles of statistical mechanics rather than their applications, we 
may now conclude the present chapter with sections giving a partial 
account of the methods employed, and results obtained, in the three 
cases of systems exhibiting the Maxwell-Boltzmann, Einstein-Bose, and 
Fermi-Dirac types of statistical result. We shall begin by considering 
systems composed of weakly interacting elements that are permanently 
distinguishable from each other, and hence systems having eigensolu- 
tions not subject to symmetry restrictions and having the Maxwell- 
Boltzmann type of equilibrium distribution. 

(a) Mean energy of oscillators of frequency $\nu$. The Maxwell-Boltzmann
systems, of major interest from the point of view of applications,
may be regarded as consisting of sets of harmonic oscillators all having
the same frequency $\nu$, but distinguishable from each other by permanent
spatial location or orientation. It will be desirable to obtain an
expression for the mean energy of such a set of oscillators under
equilibrium conditions.

In accordance with the quantum mechanical properties of such
oscillators, as illustrated by the treatment given in § 72 to a particle in
a Hooke's law field of force, we can take the energy spectrum for our
oscillators as expressed by

$$\epsilon_\kappa = (\kappa+\tfrac{1}{2})\,h\nu \quad\text{with}\quad \kappa = 0, 1, 2, 3, \ldots \tag{92.1}$$

And hence for the number of eigenstates in the energy range

$$\Delta\epsilon_\kappa = h\nu\,\Delta\kappa$$

we can take

$$g_\kappa = \Delta\kappa. \tag{92.2}$$

Substituting in the Maxwell-Boltzmann form of the distribution law
we can then write

$$n_\kappa = e^{-\alpha-\beta(\kappa+\frac{1}{2})h\nu}\,\Delta\kappa, \tag{92.3}$$

as an expression for the number of oscillators at equilibrium which
would have values of the quantum number $\kappa$ falling in the range
$\Delta\kappa$.

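With the help of this distribution law we can then evidently calculate
the mean energy of such oscillators by the expression

$$\bar{\epsilon} = \frac{\sum_{\kappa=0}^{\infty} e^{-\alpha-\beta(\kappa+\frac{1}{2})h\nu}\,(\kappa+\tfrac{1}{2})\,h\nu\,\Delta\kappa}{\sum_{\kappa=0}^{\infty} e^{-\alpha-\beta(\kappa+\frac{1}{2})h\nu}\,\Delta\kappa}, \tag{92.4}$$

where we now use a single bar to denote a mean value for the oscillators
composing the system of interest. To evaluate this expression we may
choose equal values for the different ranges $\Delta\kappa$, and then cancel
the factors $\Delta\kappa$, $e^{-\alpha}$, and $e^{-\beta h\nu/2}$ from all
terms in the numerator and denominator. This then leads to

$$\bar{\epsilon} = h\nu\,\frac{\sum_{\kappa=0}^{\infty} \kappa\,e^{-\kappa\beta h\nu}}{\sum_{\kappa=0}^{\infty} e^{-\kappa\beta h\nu}} + \frac{h\nu}{2} = h\nu\,\frac{e^{-\beta h\nu} + 2e^{-2\beta h\nu} + \cdots}{1 + e^{-\beta h\nu} + e^{-2\beta h\nu} + \cdots} + \frac{h\nu}{2},$$

or finally to

$$\bar{\epsilon} = \frac{h\nu}{e^{\beta h\nu}-1} + \frac{h\nu}{2} = \frac{h\nu}{e^{h\nu/kT}-1} + \frac{h\nu}{2}, \tag{92.5}$$

where the next to the last form is obtained by performing the indicated
division, and the last form is obtained by substituting for $\beta$ its
known expression (91.8) in terms of the temperature T of the system. We
thus obtain the desired expression in the quantum statistical mechanics
for the mean energy $\bar{\epsilon}$ to be ascribed to oscillators of
frequency $\nu$ in a system which has come to equilibrium at temperature T.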
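The step from the sums in (92.4) to the closed form (92.5) can be verified numerically. In the sketch below, which is illustrative only, energies are measured in units of $h\nu$ and the single parameter `beta_hv` stands for $\beta h\nu = h\nu/kT$; the function names and the number of terms retained in the sums are choices made here.

```python
from math import exp

def mean_energy_sum(beta_hv, terms=2000):
    """Mean oscillator energy in units of h*nu from the sums in (92.4).

    The common factors exp(-alpha) and the equal ranges cancel and are omitted.
    """
    num = sum((k + 0.5) * exp(-beta_hv * (k + 0.5)) for k in range(terms))
    den = sum(exp(-beta_hv * (k + 0.5)) for k in range(terms))
    return num / den

def mean_energy_closed(beta_hv):
    """Closed form (92.5) in units of h*nu."""
    return 1.0 / (exp(beta_hv) - 1.0) + 0.5

for beta_hv in (0.1, 1.0, 5.0):
    print(beta_hv, mean_energy_sum(beta_hv), mean_energy_closed(beta_hv))

# High-temperature check: the mean energy approaches kT, here 100 units of h*nu.
print(mean_energy_closed(0.01))
```

At small values of $h\nu/kT$ the mean energy approaches the classical value kT, while at large values only the term $h\nu/2$ remains.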

(b) Application to the specific heat of solids. The use of the above
result in obtaining an understanding of the specific heat of solids has
been very important. Empirically, the heat capacity of a crystal
composed of n atoms of an elementary substance is known to drop from
the classical value

$$C_v = 3nk \tag{92.6}$$

as given by Dulong and Petit, towards zero as we go to sufficiently
low temperatures.

The general nature of the explanation for this phenomenon was first
given by Einstein,† who appreciated that the 3n oscillators
corresponding to the n atoms of a crystalline solid should be assigned the
mean energy which they would have on the basis of the quantum theory,‡
rather than the classical value 3nkT. In carrying this out he assumed
somewhat too schematically that the same frequency $\nu$ could be used
for all 3n oscillators, but obtained nevertheless an expression for heat
capacity which had the qualitatively correct behaviour of dropping
from the classical value 3nk to zero.
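The behaviour just described can be exhibited by differentiating 3n times the mean energy (92.5) with respect to T, which gives a heat capacity equal to 3nk multiplied by $x^2 e^x/(e^x-1)^2$ with $x = h\nu/kT$. The text does not write this expression out, so the sketch below should be read as a worked consequence of (92.5) under that single-frequency assumption. The reduced temperature `t`, standing for $kT/h\nu$, and the function names are introduced here for illustration.

```python
from math import exp

def mean_energy(t):
    """Mean oscillator energy (92.5) in units of h*nu, with t = kT/(h*nu)."""
    return 1.0 / (exp(1.0 / t) - 1.0) + 0.5

def heat_capacity_ratio(t):
    """Heat capacity divided by 3nk, from differentiating (92.5) with respect to T."""
    x = 1.0 / t
    return x * x * exp(x) / (exp(x) - 1.0) ** 2

for t in (0.05, 0.2, 0.5, 1.0, 5.0, 50.0):
    print(t, heat_capacity_ratio(t))

# Numerical derivative of the mean energy as a cross-check at one temperature.
h = 1e-5
numeric = (mean_energy(0.5 + h) - mean_energy(0.5 - h)) / (2.0 * h)
print(numeric, heat_capacity_ratio(0.5))
```

The ratio rises from practically zero at low reduced temperature to practically unity at high reduced temperature, which is the drop from the Dulong and Petit value referred to above. The constant term $h\nu/2$ plays no part, since it disappears on differentiation.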


The more complete theory for the specific heats of crystalline solids
given by Debye‖ regards the thermal energy of a solid as distributed
among the various modes of elastic vibration of different frequencies 
of which the solid is capable. We shall give further consideration to 
this theory in Chapter XIV, when we come to a consideration of the 
thermodynamic properties of crystals. See § 137. 

† Einstein, Ann. der Phys. 22, 180 (1907).

‡ For this purpose Einstein used only the first term of (92.5), since the
term $h\nu/2$ was not given by the older quantum theory. This makes no
difference, however, after differentiation by T to obtain heat capacities.

‖ Debye, Ann. der Phys. 39, 789 (1912). See also Born and von Kármán,
Phys. Zeits. 13, 297 (1912); and further articles in volumes 14 and 15.

(c) Application to radiation. The foregoing result, as to the mean
energy of oscillators at equilibrium, also has an important application
in connexion with the equilibrium distribution of radiation in a hollow
enclosure which has come to thermal equilibrium. For this purpose we
may regard the radiant energy in such an enclosure as resident in the
various modes of electromagnetic vibration which the enclosure would
present.

For the number of such modes of vibration inside a hollow enclosure
of volume v, and lying in the frequency range $\nu$ to $\nu+d\nu$, we may
take the known expression

$$dZ = \frac{8\pi v\,\nu^2}{c^3}\,d\nu. \tag{92.7}$$

And for the mean energy of a mode of vibration of frequency $\nu$ in a
system at temperature T, we may take the expression (92.5)

$$\bar{\epsilon} = \frac{h\nu}{e^{h\nu/kT}-1} + \frac{h\nu}{2} \tag{92.8}$$

as obtained for harmonic oscillators of frequency $\nu$.

In combining these two expressions in the case of radiant energy
we shall arbitrarily drop the term $h\nu/2$, known to be appropriate in
mechanical considerations, since its retention would lead in the present
case to an infinite density of electromagnetic energy even in empty
space at the absolute zero of temperature. Combining the two expressions
with this elimination, and dividing by the volume v, we are then
led to the well-known Planck radiation law

$$du = \frac{8\pi h\nu^3}{c^3}\,\frac{1}{e^{h\nu/kT}-1}\,d\nu, \tag{92.9}$$

which gives the density of radiant energy in the range $\nu$ to
$\nu+d\nu$ in a hollow enclosure which has come to thermal equilibrium at
temperature T. It will be noted from the form of (92.9) that, if we should
let h go to zero, we should then have, in accordance with the
correspondence principle, the Rayleigh-Jeans law
$du = (8\pi\nu^2/c^3)\,kT\,d\nu$ which corresponds to a classical treatment
of the modes of oscillation in the enclosure.
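The relation between (92.9) and the Rayleigh-Jeans law can be examined numerically. Instead of letting h go to zero, the sketch below keeps h fixed and goes to frequencies for which $h\nu/kT$ is small, which has the same effect on the ratio of the two expressions. The numerical constants are present-day SI values and are not taken from the text; the temperature, the frequencies, and the function names are illustrative choices.

```python
from math import expm1, pi

H = 6.62607015e-34   # Planck constant in J s (present-day SI value)
K = 1.380649e-23     # Boltzmann constant in J/K (present-day SI value)
C = 2.99792458e8     # velocity of light in m/s

def planck(nu, temp):
    """Energy density per unit frequency range according to (92.9)."""
    return 8.0 * pi * H * nu ** 3 / C ** 3 / expm1(H * nu / (K * temp))

def rayleigh_jeans(nu, temp):
    """Classical energy density per unit frequency range."""
    return 8.0 * pi * nu ** 2 / C ** 3 * K * temp

temp = 300.0
for nu in (1e9, 1e12, 1e13, 1e14):
    ratio = planck(nu, temp) / rayleigh_jeans(nu, temp)
    print(nu, H * nu / (K * temp), ratio)
```

The ratio is practically unity as long as $h\nu/kT$ is small and falls rapidly towards zero once $h\nu$ exceeds kT, which is where the classical law fails.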

The above provides a simple and formally satisfactory derivation of 
the Planck radiation law which is very instructive. A more elaborate 
treatment of the problem would necessitate a development of the 
quantum mechanical theory of fields, which at the present time can 
hardly be regarded as leading to a deeper insight into the physical 
nature of the problem. 


93. Einstein-Bose systems 

We may now turn our attention to systems of weakly interacting 
constituent elements which exhibit the Einstein-Bose type of equili- 
brium distribution. This will be the case for systems of particles which 
cannot be permanently distinguished the one from another, and which 
exhibit eigensolutions that must be symmetric in character. 

The general treatment of Einstein-Bose distributions is mathematically
somewhat complicated, owing to the circumstance, already noted in
§ 91 (a), that simple explicit solutions for the constant $\alpha$ are
not ordinarily available. For this reason we shall first consider the
mathematically simple case of photons where this difficulty does not
arise.

(a) Application to radiation. The treatment of the electromagnetic
energy in a hollow enclosure, which has come to thermal equilibrium,
as an Einstein-Bose system in which photons are regarded as taking
the place of more ordinary particles, has proved quite interesting. Such
a treatment was the original purpose of the statistics introduced by
Bose† before the development of the quantum mechanics. To make
such an application is formally at least quite simple.

† Bose, Zeits. f. Phys. 27, 384 (1924).

For the number of photons $n_\kappa$ in the $g_\kappa$ elementary
eigenstates corresponding to an energy range $\epsilon_\kappa$ to
$\epsilon_\kappa+\Delta\epsilon_\kappa$ we may take

$$n_\kappa = \frac{g_\kappa}{e^{\beta\epsilon_\kappa}-1}, \tag{93.1}$$

in agreement with our assumption that photons are to be treated as
particles obeying the Einstein-Bose distribution law (89.6), and in
agreement with the consideration already mentioned in § 89, that the
constant $\alpha$ is to be taken as zero in applying this equation to
photons, since their total number in an isolated system does not have to
be regarded as conserved. Taking the energy of a photon as related to
the corresponding frequency $\nu$ by

$$\epsilon = h\nu, \tag{93.2}$$

substituting

$$\beta = 1/kT, \tag{93.3}$$

and changing to a differential mode of expression, the equation given
by (93.1) may then be rewritten in the more convenient form

$$dn = \frac{dg}{e^{h\nu/kT}-1}, \tag{93.4}$$

where dn is now to be taken as the number of photons that would be
present at equilibrium in the dg elementary eigenstates that would be in
the frequency range $\nu$ to $\nu+d\nu$.

To make use of this expression we must now make some assumption
as to the number of eigenstates for a photon that would lie in a given
frequency range. For this purpose we may regard the relation between
photons and electromagnetic waves as roughly similar to the relation
between ordinary particles and probability waves. On this basis we
may then associate each eigenstate for a photon in an enclosure with
a corresponding steady state solution of the electromagnetic wave
equation, just as we have previously associated each eigenstate for a
particle in a container with a steady state solution of the Schroedinger
wave equation. Doing so, we can then put the number of elementary
eigenstates dg, in range $\nu$ to $\nu+d\nu$, for a photon in an enclosure
of volume v, equal to the known number of modes of electromagnetic
vibration in such an enclosure:

$$dg = \frac{8\pi v\,\nu^2}{c^3}\,d\nu. \tag{93.5}$$


Combining (93.4) with (93.5), multiplying by the energy per photon
$h\nu$, and dividing by $v$ so as to obtain the energy density, we are then
led once more to the Planck radiation law

$$du = \frac{8\pi h\nu^{3}}{c^{3}}\, \frac{1}{e^{h\nu/kT} - 1}\, d\nu \tag{93.6}$$

for the density of radiant energy in the range $\nu$ to $\nu + d\nu$, in a hollow
enclosure, which has come to thermal equilibrium at temperature $T$.
Here it may be noted that we should have been led to the Wien law
$du = (8\pi h\nu^{3}/c^{3})\, e^{-h\nu/kT}\, d\nu$ if we had treated the photons as
statistically independent and applied the Maxwell-Boltzmann rather than
the Einstein-Bose distribution law, since the $(-1)$ in the denominator
would then have been missing.
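The difference between the Planck and Wien forms can be made concrete with a small numerical sketch. The function names and the modern CGS values of the constants are assumptions introduced here for illustration; only the two formulae come from the text.

```python
import math

# Assumed modern CGS values; the text does not quote constants at this point.
H = 6.626e-27   # Planck constant, erg s
K = 1.381e-16   # Boltzmann constant, erg / deg
C = 2.998e10    # speed of light, cm / s


def planck_density(nu, temp):
    """Energy density per unit frequency, du/dnu, from the Planck law (93.6)."""
    x = H * nu / (K * temp)
    return 8.0 * math.pi * H * nu**3 / C**3 / math.expm1(x)


def wien_density(nu, temp):
    """Same quantity with the (-1) dropped from the denominator (Wien form)."""
    x = H * nu / (K * temp)
    return 8.0 * math.pi * H * nu**3 / C**3 * math.exp(-x)


if __name__ == "__main__":
    temp = 300.0
    for x in (0.1, 1.0, 5.0, 10.0):
        nu = x * K * temp / H          # frequency at which h nu / kT = x
        ratio = wien_density(nu, temp) / planck_density(nu, temp)
        print(f"h nu / kT = {x:5.1f}   Wien / Planck = {ratio:.6f}")
```

The ratio of the two expressions is $1 - e^{-h\nu/kT}$, so the Wien form always lies below the Planck form and approaches it only when $h\nu$ is large compared with $kT$.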

We are thus able to give a derivation of the Planck law, either from 
the point of view of the preceding section treating radiation as electro- 
magnetic waves, or from the point of view of the present section treating 
radiation as photons, provided in each case we make appropriate 
quantum theoretic generalizations of the classical treatment. Thus, 
also in the case of electromagnetic as well as in the case of mechanical
phenomena, we encounter the wave-particle duality which plays so
fundamental a role in quantum theory. The existence of a formal
equivalence between results obtained from oscillators obeying Maxwell-
Boltzmann relations and particles obeying Einstein-Bose relations has
been specially emphasized by Dirac. In conclusion, it is to be remarked
that both of the above derivations for the Planck law may appear at 
the present stage of theory to be somewhat arbitrarily guided by the 
known character of the result to be obtained and the heuristic applica- 
tion of the correspondence principle. 

(b) Useful integrals in the Einstein-Bose case. It will now be of
interest to give some consideration to the mathematical methods that
can be employed in the general treatment of Einstein-Bose systems of
particles, when simple explicit solutions for the parameter $\alpha$, in the
distribution law

$$n_k = \frac{g_k}{e^{\alpha + \beta\epsilon_k} - 1}, \tag{93.7}$$

are not available.

For this purpose it proves convenient to consider integrals of the
general form

$$U(\alpha,\rho) = \frac{1}{\Gamma(\rho+1)} \int_0^\infty \frac{z^{\rho}\, dz}{e^{\alpha+z} - 1}, \tag{93.8}$$

where $\alpha$ is the parameter mentioned, and $\rho$ is a number which in our
applications will be either $\tfrac12$ or $\tfrac32$. We shall only have to consider
positive values of $\alpha$, since with $\alpha$ negative it is evident that the
distribution law (93.7) could lead to negative values of the numbers of
particles in eigenstates of sufficiently low energy. With $\alpha$ positive, integrals
of the form (93.8) can be treated by carrying out the indicated division
of $z^{\rho}$ by $(e^{\alpha+z} - 1)$ in the integrand. This then gives a series of terms
which can be handled by known formulae of integration (see Appendix
II). In this way we readily obtain

$$U(\alpha,\rho) = e^{-\alpha} + \frac{e^{-2\alpha}}{2^{\rho+1}} + \frac{e^{-3\alpha}}{3^{\rho+1}} + \frac{e^{-4\alpha}}{4^{\rho+1}} + \dots \tag{93.9}$$

which will converge rapidly when $\alpha > 1$.
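The agreement between the series (93.9) and the defining integral (93.8) can be checked numerically. The following is a minimal sketch using only the standard library; the function names, the substitution $z = t^{2}$, and the cut-off of the upper limit are choices made here for illustration and are not part of the text.

```python
import math


def u_series(alpha, rho, terms=200):
    """Series (93.9): sum over j of exp(-j*alpha) / j**(rho + 1), for alpha > 0."""
    return sum(math.exp(-j * alpha) / j ** (rho + 1) for j in range(1, terms + 1))


def u_quadrature(alpha, rho, t_max=8.0, steps=4000):
    """Integral (93.8) by Simpson's rule after substituting z = t**2.

    The substitution removes the square-root behaviour of the integrand at
    z = 0.  The upper limit t_max = 8 (z = 64) is an assumed cut-off beyond
    which the integrand is negligible for alpha > 0.  steps must be even.
    """
    def f(t):
        return 2.0 * t ** (2.0 * rho + 1.0) / math.expm1(alpha + t * t)

    h = t_max / steps
    total = f(0.0) + f(t_max)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0 / math.gamma(rho + 1.0)


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 3.0):
        for rho in (0.5, 1.5):
            print(alpha, rho, u_series(alpha, rho), u_quadrature(alpha, rho))
```

For $\alpha$ of order unity a handful of terms of the series already reproduces the integral closely, which is the rapid convergence referred to above.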

(c) Values of parameter a, energy E, and pressure p. We may now 
make use of such integrals in investigating the values of the parameter 
a, energy E, and pressure p for gases composed of particles exhibiting 
Einstein-Bose distributions at equilibrium. 

For this purpose we may begin by writing, in accordance with our
previous treatment of § 71 (b),

$$g_k = g\, \frac{4\pi v\, m\sqrt{2m}}{h^{3}}\, \epsilon_k^{1/2}\, \Delta\epsilon_k, \tag{93.10}$$

as an explicit expression for the number of elementary eigensolutions
for a particle of mass $m$, in a container of volume $v$, in the energy range
$\epsilon_k$ to $\epsilon_k + \Delta\epsilon_k$. A factor $g$ has now been explicitly introduced in this
expression to allow for different possibilities for the intrinsic angular
momentum of the particles considered. It is equal to the number of
eigenstates of angular momentum that would be possible with each
eigenstate of kinetic energy $\epsilon$. Substituting (93.10) in the Einstein-
Bose distribution law (93.7), and changing to a differential mode of
expression, we then have

$$dn = \frac{4\pi v g\, m\sqrt{2m}}{h^{3}}\, \frac{\epsilon^{1/2}\, d\epsilon}{e^{\alpha+\beta\epsilon} - 1} \tag{93.11}$$

for the number of particles in the energy range $\epsilon$ to $\epsilon + d\epsilon$.

Integrating (93.11) over all possible values of $\epsilon$ so as to obtain the
total number of particles $n$, substituting $\beta = 1/kT$, and putting
$\tfrac12\sqrt{\pi} = \Gamma(\tfrac12+1)$, we can then readily obtain the result

$$\frac{n h^{3}}{v g (2\pi m k T)^{3/2}} = \frac{1}{\Gamma(\tfrac12+1)} \int_0^\infty \frac{z^{1/2}\, dz}{e^{\alpha+z} - 1} \tag{93.12}$$

in terms of an integral of the type introduced above. This equation
shows that the value of the parameter $\alpha$ will depend on the nature and
condition of the gas, as expressed by a combination of the quantities
$n$, $v$, $m$, $T$ appearing at the left. For this important combination of
quantities it will be convenient to introduce the symbol

$$y = U(\alpha,\tfrac12) = \frac{n h^{3}}{v g (2\pi m k T)^{3/2}}. \tag{93.13}$$

It will be seen from the above equations that large values of $\alpha$ will
correspond to small values of $y$, or in more physical terms to small
values of the concentration $n/v$ combined with large values for the mass
of the particles $m$ and for the temperature of the system $T$.

Multiplying (93.11) by $\epsilon$ and integrating over all values, we obtain
for the total (kinetic) energy of the gas

$$E = \frac{4\pi v g\, m\sqrt{2m}}{h^{3}} \int_0^\infty \frac{\epsilon^{3/2}\, d\epsilon}{e^{\alpha+\beta\epsilon} - 1}, \tag{93.14}$$

and again substituting $\beta = 1/kT$, this can be re-expressed in the form

$$\frac{E h^{3}}{v g (2\pi m k T)^{3/2}} = \tfrac32 kT\, \frac{1}{\Gamma(\tfrac32+1)} \int_0^\infty \frac{z^{3/2}\, dz}{e^{\alpha+z} - 1}. \tag{93.15}$$

Hence, noting (93.8) and (93.12), we can also express the energy of the
gas in a form

$$E = \tfrac32\, nkT\, \frac{U(\alpha,\tfrac32)}{U(\alpha,\tfrac12)} \tag{93.16}$$

which depends on integrals of the type introduced above.

Finally, in order to investigate the pressure of such a gas, we note
on the one hand, in accordance with our original derivation of the
eigenfunctions for a particle in a container, see (71.10), that the energy
for such eigenfunctions would be of the form

$$\epsilon = \frac{h^{2}}{8m}\left(\frac{n_1^{2}}{l_1^{2}} + \frac{n_2^{2}}{l_2^{2}} + \frac{n_3^{2}}{l_3^{2}}\right), \tag{93.17}$$

where $n_1$, $n_2$, and $n_3$ are integral quantum numbers, and $l_1$, $l_2$, and $l_3$
are the linear dimensions of the assumed rectangular container. On the
other hand, it can be shown, see § 97, that a particle, started off in any
eigenstate, as defined by the above quantum numbers, would remain
in that same state during any sufficiently slow adiabatic change in the
dimensions of the container. Hence, noting the above inverse dependence
on the square of the linear dimensions, and considering the
average effect for different states, we can take

$$E = \frac{\text{const.}}{v^{2/3}} \tag{93.18}$$

in the case of a gas containing many such particles, as an expression for
the dependence of energy on volume when reversible adiabatic changes
in the volume are to be considered. In accordance with the definition
of pressure in terms of the work done in a reversible adiabatic expansion,
we can then take

$$p = -\frac{dE}{dv} = \frac{2}{3}\, \frac{\text{const.}}{v^{5/3}} = \frac{2}{3}\, \frac{E}{v} \tag{93.19}$$

as an expression, in the quantum as well as in the classical mechanics,
for the dependence of the pressure $p$ of a dilute gas on its energy $E$
and volume $v$, under equilibrium conditions. Combining (93.19) with
(93.16), we then obtain

$$p = \frac{nkT}{v}\, \frac{U(\alpha,\tfrac32)}{U(\alpha,\tfrac12)} \tag{93.20}$$

as the desired expression for the pressure of an Einstein-Bose gas in
terms of integrals of the type that we have introduced.

(d) Case of slight degeneration. We may now investigate the values
of $\alpha$, $E$, and $p$ for an Einstein-Bose gas under conditions where $\alpha$ is
large enough so that the series expansion (93.9), for the integrals that
we have introduced, will converge rapidly. These may be called conditions
of slight degeneration since the properties of the gas will then
approach those for a classical perfect gas.

Under such conditions, with the help of equations (93.9) and (93.13),
we can then write as a series expansion for the quantity $y$, which
conveniently characterizes the condition of the gas,

$$y = U(\alpha,\tfrac12) = \frac{n h^{3}}{v g (2\pi m k T)^{3/2}} = e^{-\alpha} + \frac{e^{-2\alpha}}{2^{3/2}} + \frac{e^{-3\alpha}}{3^{3/2}} + \dots \tag{93.21}$$

which can be readily solved to give

$$e^{-\alpha} = y\left\{1 - \frac{1}{2^{3/2}}\, y + \left(\frac{1}{4} - \frac{1}{3^{3/2}}\right) y^{2} - \dots\right\}. \tag{93.22}$$

Furthermore, we can also make use of the series expansion (93.9) to
rewrite our expression for total energy (93.16) in the form

$$E = \tfrac32\, nkT\, \frac{e^{-\alpha} + e^{-2\alpha}/2^{5/2} + e^{-3\alpha}/3^{5/2} + \dots}{e^{-\alpha} + e^{-2\alpha}/2^{3/2} + e^{-3\alpha}/3^{3/2} + \dots} \tag{93.23}$$

which, by substituting (93.22), leads to the result

$$E = \tfrac32\, nkT\,(1 - 0.1768\,y - 0.0033\,y^{2} - \dots). \tag{93.24}$$

For the pressure of the gas we then obtain

$$p = \frac{nkT}{v}\,(1 - 0.1768\,y - 0.0033\,y^{2} - \dots). \tag{93.25}$$

In accordance with these results, noting the dependence of $y$ on $n/v$,
$m$, and $T$, we see that the properties of our Einstein-Bose gas would
approach those for an ordinary perfect gas as we increase the dilution,
particle mass, and temperature of the gas. As we approach this limit,
however, the energy and pressure would always be less than those for
a perfect gas.
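The numerical coefficients in (93.24) can be checked without carrying out the series inversion by hand: for a chosen $y$ one may solve $U(\alpha,\tfrac12) = y$ for $\alpha$ numerically and evaluate the ratio in (93.16) directly. The sketch below does this by bisection; the function names, the bracketing interval, and the number of series terms are assumptions made for illustration.

```python
import math


def u_series(alpha, rho, terms=400):
    """Series (93.9) for U(alpha, rho), valid for alpha > 0."""
    return sum(math.exp(-j * alpha) / j ** (rho + 1) for j in range(1, terms + 1))


def alpha_from_y(y, lo=1e-3, hi=50.0):
    """Solve U(alpha, 1/2) = y for alpha by bisection.

    U decreases steadily as alpha increases, so the root is bracketed by an
    assumed interval [lo, hi]; this is adequate for y up to about 0.5.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if u_series(mid, 0.5) > y:
            lo = mid      # U too large, so alpha must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


def energy_ratio_exact(y):
    """E / (3/2 n k T) from (93.16), with alpha found numerically."""
    a = alpha_from_y(y)
    return u_series(a, 1.5) / u_series(a, 0.5)


def energy_ratio_series(y):
    """Truncated expansion (93.24)."""
    return 1.0 - 0.1768 * y - 0.0033 * y * y


if __name__ == "__main__":
    for y in (0.01, 0.1, 0.3):
        print(y, energy_ratio_exact(y), energy_ratio_series(y))
```

For small $y$ the two columns agree closely, and both lie below unity, in line with the statement that the energy and pressure fall short of the perfect gas values.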

Nevertheless, these theoretical deviations are not important under
ordinary circumstances. For example, in the case of molecular hydrogen,
which should act as an Einstein-Bose gas, the value of $y$ under
standard conditions of pressure and temperature would be only of the
order $10^{-5}$. Hence the slight degeneration present under such
circumstances would be entirely too small to detect. At very low temperatures
and high densities the deviations of Einstein-Bose gases from the classical
behaviour of perfect gases would of course become important but
would then be difficult to distinguish from deviations due to strong
instead of weak particle interaction.† At the present time the application
of the Einstein-Bose results to photons is the most important one
that we have.
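The smallness of $y$ for an ordinary gas is easily verified. The following sketch evaluates (93.13) for molecular hydrogen; the number density, molecular mass, modern values of $h$ and $k$, and the choice $g = 1$ are all assumed illustrative inputs, not figures taken from the text.

```python
import math


def degeneracy_parameter(n_over_v, mass, temp, g=1.0, h=6.626e-27, k=1.381e-16):
    """y = n h^3 / (v g (2 pi m k T)^(3/2)) in CGS units, as in (93.13)."""
    return n_over_v * h**3 / (g * (2.0 * math.pi * mass * k * temp) ** 1.5)


# Assumed inputs for molecular hydrogen at 0 deg C and 1 atmosphere.
N_OVER_V = 2.69e19   # molecules per cubic centimetre
MASS_H2 = 3.35e-24   # grams per molecule

if __name__ == "__main__":
    print(degeneracy_parameter(N_OVER_V, MASS_H2, 273.15))
```

With these inputs the correction term $0.1768\,y$ in (93.24) amounts to only a few parts in a million.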

† For the treatment of highly degenerate Einstein-Bose gases, it is not always
legitimate to replace sums by integrals as we have done (see Uhlenbeck, Over statistische
methoden in de Theorie der Quanta, 's-Gravenhage, 1927, p. 70). For deviations
between classical and quantum behaviour in the case of transport phenomena, see
Uehling and Uhlenbeck, Phys. Rev. 43, 552 (1933); Uehling, Phys. Rev. 46, 917 (1934);
Massey and Mohr, Proc. Roy. Soc. A, 141, 434 (1933), A, 144, 188 (1933).





94. Fermi-Dirac systems 

We may now turn to the consideration of systems of weakly interacting
constituent elements which exhibit the Fermi-Dirac type of
equilibrium distribution. This will be the case for systems of particles
which cannot be permanently distinguished one from another, and
which exhibit eigensolutions that must be antisymmetric in character.

(a) Useful integrals in the Fermi-Dirac case. Explicit solutions for
the parameter $\alpha$ appearing in the Fermi-Dirac distribution law

$$n_k = \frac{g_k}{e^{\alpha + \beta\epsilon_k} + 1} \tag{94.1}$$

are in general not available. And it now proves convenient to make
the general treatment depend on integrals of the general form

$$V(\alpha,\rho) = \frac{1}{\Gamma(\rho+1)} \int_0^\infty \frac{z^{\rho}\, dz}{e^{\alpha+z} + 1}, \tag{94.2}$$

where $\alpha$ is the parameter mentioned, and $\rho$ is a number which in our
applications will be either $\tfrac12$ or $\tfrac32$.

These integrals are similar in form to the previous integrals $U(\alpha,\rho)$
introduced for the treatment of Einstein-Bose systems, but have $+1$
instead of $-1$ in the denominator of the integrand. In evaluating these
integrals it will not now be possible to confine the consideration to
positive values of $\alpha$, since the distribution given by (94.1) does not now
become physically impossible for negative values of $\alpha$.

When $\alpha$ is positive the above integral can easily be treated by
performing the indicated division by the term $(e^{\alpha+z} + 1)$. This then gives
a series of terms which can be handled by known formulae of integration
(see Appendix II). In this way we readily obtain

$$V(\alpha,\rho) = e^{-\alpha} - \frac{e^{-2\alpha}}{2^{\rho+1}} + \frac{e^{-3\alpha}}{3^{\rho+1}} - \frac{e^{-4\alpha}}{4^{\rho+1}} + \dots \tag{94.3}$$

which will converge rapidly if $\alpha \geq 1$.

When $\alpha$ is negative the foregoing series is no longer useful. An
approximate treatment by Sommerfeld† then leads to the result

$$V(\alpha,\rho) = \frac{(-\alpha)^{\rho+1}}{\Gamma(\rho+2)} \left[1 + 2\left\{\frac{(\rho+1)\rho\, c_2}{\alpha^{2}} + \frac{(\rho+1)\rho(\rho-1)(\rho-2)\, c_4}{\alpha^{4}} + \dots\right\}\right], \tag{94.4}$$

where the constants $c_2$, $c_4$, etc., have the values

$$c_\nu = 1 - \frac{1}{2^{\nu}} + \frac{1}{3^{\nu}} - \frac{1}{4^{\nu}} + \dots \tag{94.5}$$

This series converges rapidly when $-\alpha \gg 1$. The approximation
employed in obtaining it involves an error of the order of $e^{\alpha}$.

† Sommerfeld, Zeits. f. Phys. 47, 1 (1928).
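The quality of the Sommerfeld approximation for large negative $\alpha$ can be examined by comparing the first two correction terms of (94.4) with a direct numerical evaluation of the integral (94.2). The sketch below uses only the standard library; the function names, the substitution $z = t^{2}$, the cut-off of the upper limit, and the closed forms $c_2 = \pi^{2}/12$ and $c_4 = 7\pi^{4}/720$ for the alternating sums are introduced here for illustration.

```python
import math


def v_quadrature(alpha, rho, steps=20000):
    """Integral (94.2) by Simpson's rule after substituting z = t**2.

    The upper limit is an assumed cut-off placed 60 units of z beyond the
    point z = -alpha, where the integrand has become negligible.
    """
    t_max = math.sqrt(max(-alpha, 0.0) + 60.0)

    def f(t):
        return 2.0 * t ** (2.0 * rho + 1.0) / (math.exp(alpha + t * t) + 1.0)

    h = t_max / steps
    total = f(0.0) + f(t_max)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(i * h)
    return total * h / 3.0 / math.gamma(rho + 1.0)


def v_sommerfeld(alpha, rho):
    """Expansion (94.4) truncated after the c_4 term (alpha large, negative)."""
    c2 = math.pi ** 2 / 12.0          # 1 - 1/2^2 + 1/3^2 - ...
    c4 = 7.0 * math.pi ** 4 / 720.0   # 1 - 1/2^4 + 1/3^4 - ...
    bracket = 1.0 + 2.0 * (
        (rho + 1) * rho * c2 / alpha ** 2
        + (rho + 1) * rho * (rho - 1) * (rho - 2) * c4 / alpha ** 4
    )
    return (-alpha) ** (rho + 1) / math.gamma(rho + 2) * bracket


if __name__ == "__main__":
    for alpha in (-5.0, -10.0, -20.0):
        for rho in (0.5, 1.5):
            print(alpha, rho, v_quadrature(alpha, rho), v_sommerfeld(alpha, rho))
```

The agreement improves quickly as $-\alpha$ grows, which is the regime of extreme degeneration in which the expansion is meant to be used.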

(b) Values of parameter $\alpha$, energy $E$, and pressure $p$. We may now
make use of such integrals in investigating the values of the parameter
$\alpha$, energy $E$, and pressure $p$ for gases composed of particles exhibiting
Fermi-Dirac distributions at equilibrium. For this purpose we may
begin by re-expressing our distribution law (94.1) in the more
convenient differential form

$$dn = \frac{4\pi v g\, m\sqrt{2m}}{h^{3}}\, \frac{\epsilon^{1/2}\, d\epsilon}{e^{\alpha+\beta\epsilon} + 1}, \tag{94.6}$$

where $dn$ denotes the number of particles in the energy range $\epsilon$ to
$\epsilon + d\epsilon$. The method of obtaining this form of expression is similar to
that given in detail for the analogous expression (93.11) in the Einstein-
Bose development and the quantities involved have a similar
significance.

Integrating (94.6) over all possible values, so as to obtain the
total number of particles $n$, substituting $\beta = 1/kT$, and putting
$\tfrac12\sqrt{\pi} = \Gamma(\tfrac12+1)$, we can then readily obtain the result

$$\frac{n h^{3}}{v g (2\pi m k T)^{3/2}} = \frac{1}{\Gamma(\tfrac12+1)} \int_0^\infty \frac{z^{1/2}\, dz}{e^{\alpha+z} + 1} \tag{94.7}$$

in terms of an integral of the type that we have introduced. This
equation shows also in the Fermi-Dirac case that the value of the
parameter $\alpha$ will depend on the nature and condition of the gas, as
expressed by a combination of quantities, which we can again denote
by the symbol

$$y = V(\alpha,\tfrac12) = \frac{n h^{3}}{v g (2\pi m k T)^{3/2}}. \tag{94.8}$$

It will again be seen from the present two equations that large values
of $\alpha$ will correspond also in the Fermi-Dirac case to small values of $y$,
or in more physical terms to small values of the concentration $n/v$
combined with large values for the mass of the particles $m$, and for the
temperature of the system $T$.

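Multiplying (94.6) by $\epsilon$, and integrating over all values, we now
obtain for the total (kinetic) energy of the gas

$$E = \frac{4\pi v g\, m\sqrt{2m}}{h^{3}} \int_0^\infty \frac{\epsilon^{3/2}\, d\epsilon}{e^{\alpha+\beta\epsilon} + 1}, \tag{94.9}$$

and again substituting $\beta = 1/kT$, this can be rewritten in the form

$$\frac{E h^{3}}{v g (2\pi m k T)^{3/2}} = \tfrac32 kT\, \frac{1}{\Gamma(\tfrac32+1)} \int_0^\infty \frac{z^{3/2}\, dz}{e^{\alpha+z} + 1}. \tag{94.10}$$

Hence, noting (94.2) and (94.7), we can also express the energy of the
gas in a form

$$E = \tfrac32\, nkT\, \frac{V(\alpha,\tfrac32)}{V(\alpha,\tfrac12)} \tag{94.11}$$

which depends on integrals of the type that we have introduced.

Finally, in order to obtain an expression for the pressure of the gas,
it is evident that we can again use the relation (93.19),

$$p = \frac{2E}{3v}, \tag{94.12}$$

and hence write

$$p = \frac{nkT}{v}\, \frac{V(\alpha,\tfrac32)}{V(\alpha,\tfrac12)}. \tag{94.13}$$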


(c) Case of slight degeneration. We may now investigate the values
of $\alpha$, $E$, and $p$ for a Fermi-Dirac gas under conditions where $\alpha$ is large
enough so that the first of the two series expansions (94.3), for the
integrals $V(\alpha,\rho)$, will converge rapidly. These may be called conditions
of slight degeneration since the properties of the gas will then approach
those for a classical perfect gas.

Under such conditions, with the help of equations (94.3) and (94.8),
we can then write as a series expansion for the quantity $y$, which
conveniently characterizes the condition of the gas,

$$y = V(\alpha,\tfrac12) = \frac{n h^{3}}{v g (2\pi m k T)^{3/2}} = e^{-\alpha} - \frac{e^{-2\alpha}}{2^{3/2}} + \frac{e^{-3\alpha}}{3^{3/2}} - \dots \tag{94.14}$$

which can readily be solved to give

$$e^{-\alpha} = y\left\{1 + \frac{1}{2^{3/2}}\, y + \left(\frac{1}{4} - \frac{1}{3^{3/2}}\right) y^{2} + \dots\right\}. \tag{94.15}$$

Furthermore, we can now also make use of the series expansions
(94.3) to rewrite our expression for total kinetic energy (94.11) in the
form

$$E = \tfrac32\, nkT\, \frac{e^{-\alpha} - e^{-2\alpha}/2^{5/2} + e^{-3\alpha}/3^{5/2} - \dots}{e^{-\alpha} - e^{-2\alpha}/2^{3/2} + e^{-3\alpha}/3^{3/2} - \dots} \tag{94.16}$$

which, by substituting (94.15) and computing the numerical coefficients,
leads to the result

$$E = \tfrac32\, nkT\,(1 + 0.1768\,y - 0.0033\,y^{2} + \dots). \tag{94.17}$$

For the pressure of the gas, in agreement with (94.12), we shall then
obtain

$$p = \frac{nkT}{v}\,(1 + 0.1768\,y - 0.0033\,y^{2} + \dots). \tag{94.18}$$

These results, except for signs, are similar to those for the Einstein-
Bose case. In accordance therewith, noting the dependence of $y$ on
$n/v$, $m$, and $T$, we see that the properties of our Fermi-Dirac gas would
approach those for an ordinary perfect gas as we increase the dilution,
particle mass, and temperature of the gas. As we approach this limit,
however, the energy and pressure of the Fermi-Dirac gas would always
be higher than those for a perfect gas. Nevertheless, for particles of
the mass of ordinary atoms at familiar concentrations and temperatures,
the deviations from perfect gas behaviour would lie well below
detection. (See remarks in § 93 (d) concerning Einstein-Bose gases.)
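The origin of the two numerical coefficients, and of the difference in sign between the two kinds of gas, can be seen from a short worked expansion. The abbreviation $x = e^{-\alpha}$ and the use of double signs (upper for Einstein-Bose, lower for Fermi-Dirac) are conventions adopted here for compactness; the steps are ordinary series inversion and division carried to second order.

```latex
% Notation (assumed here): x = e^{-\alpha}; upper signs Einstein-Bose, lower signs Fermi-Dirac.
\begin{align*}
y &= x \pm \frac{x^{2}}{2^{3/2}} + \frac{x^{3}}{3^{3/2}} \pm \dots
  \quad\Longrightarrow\quad
  x = y \mp \frac{y^{2}}{2^{3/2}} + \Bigl(\frac{1}{4} - \frac{1}{3^{3/2}}\Bigr) y^{3} \mp \dots \\[4pt]
\frac{E}{\tfrac32 nkT}
  &= \frac{x \pm x^{2}/2^{5/2} + x^{3}/3^{5/2} \pm \dots}
          {x \pm x^{2}/2^{3/2} + x^{3}/3^{3/2} \pm \dots}
   = 1 \mp \frac{x}{2^{5/2}} - \Bigl(\frac{2}{3^{5/2}} - \frac{1}{16}\Bigr) x^{2} + \dots \\[4pt]
  &= 1 \mp \frac{y}{2^{5/2}} - \Bigl(\frac{2}{3^{5/2}} - \frac{1}{8}\Bigr) y^{2} + \dots
   \approx 1 \mp 0.1768\,y - 0.0033\,y^{2} + \dots
\end{align*}
```

The first-order term changes sign between the two cases while the second-order term does not, which is why the Einstein-Bose and Fermi-Dirac results differ only in the sign of the term linear in $y$.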

Only in the case of stellar interiors,† and in the case of the conduction
electrons in a metal, which can be treated in first approximation as an
electron gas, do we find a large enough concentration or small enough 
particle mass for the new theory to become important. We then find, 
however, even at the high temperatures of stellar interiors and up to 
ordinary temperatures for metals, that the gas can be in a condition 
of extreme degeneration. 

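† See Fowler, Monthly Notices, 87, 114 (1926-7).

(d) Case of extreme degeneration. To treat the case of extreme
degeneration in Fermi-Dirac gases (i.e. $\alpha$ large and negative) we must make use of
expansions of the form (94.4). Doing so, equation (94.8) then becomes

$$y = V(\alpha,\tfrac12) = \frac{n h^{3}}{v g (2\pi m k T)^{3/2}} = \frac{4(-\alpha)^{3/2}}{3\sqrt{\pi}} \left[1 + 2\left\{\frac{(\tfrac32)(\tfrac12)\, c_2}{\alpha^{2}} + \frac{(\tfrac32)(\tfrac12)(-\tfrac12)(-\tfrac32)\, c_4}{\alpha^{4}} + \dots\right\}\right]. \tag{94.19}$$

Introducing for $c_2$ the exact value

$$c_2 = 1 - \frac{1}{2^{2}} + \frac{1}{3^{2}} - \frac{1}{4^{2}} + \dots = \frac{\pi^{2}}{12}, \tag{94.20}$$

this can then be solved to give for $\alpha$, to the indicated order of
approximation,

$$\alpha = -\left(\frac{3\sqrt{\pi}\, y}{4}\right)^{2/3} \left\{1 - \frac{\pi^{2}}{12}\left(\frac{3\sqrt{\pi}\, y}{4}\right)^{-4/3} + \dots\right\}. \tag{94.21}$$

Furthermore, again using expansions of the form (94.4), we find that
the expression for the total energy (94.11) now becomes, to the same
order of approximation,

$$E = \tfrac35\, nkT\,(-\alpha)\left(1 + \frac{\pi^{2}}{2\alpha^{2}} + \dots\right), \tag{94.22}$$

which, with the help of our values for $\alpha$ and $y$, finally reduces to

$$E = \frac{3h^{2}n}{10m}\left(\frac{3n}{4\pi v g}\right)^{2/3} + \frac{\pi^{2} m n k^{2} T^{2}}{2h^{2}}\left(\frac{4\pi v g}{3n}\right)^{2/3} + \dots \tag{94.23}$$

For the pressure of the gas, in agreement with (94.13), we shall then
obtain

$$p = \frac{n}{v}\, \frac{h^{2}}{5m}\left(\frac{3n}{4\pi v g}\right)^{2/3} + \frac{n}{v}\, \frac{\pi^{2} m k^{2} T^{2}}{3h^{2}}\left(\frac{4\pi v g}{3n}\right)^{2/3} + \dots \tag{94.24}$$

In accordance with these results we see in the case of extreme
degeneration that a Fermi-Dirac gas would have a residual zero-point
energy and pressure even at the absolute zero of temperature. It would
also have a heat capacity at low temperatures

$$C_v = \left(\frac{\partial E}{\partial T}\right)_v = \frac{\pi^{2} m n k^{2} T}{h^{2}}\left(\frac{4\pi v g}{3n}\right)^{2/3} \tag{94.25}$$

which would be proportional to the temperature, and would be small
per particle in the case of large densities $n/v$ and small particle mass $m$.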

(e) Remarks on applications to conduction electrons. It will not be
out of place to make a few remarks concerning the really great advances
in our understanding of the properties of metallic conductors which
have resulted from the foregoing development of quantum statistics.

To explain the properties of such substances, we may regard the
outer valence electrons, provided by the atoms of a metal, as behaving
in first approximation as a free electron gas, with the repulsive forces
between them neutralized on the average by the positive ions which
then compose the lattice. This picture of metallic structure thus agrees
in a general way with that employed by Drude in the classical theory
of metallic conduction, and leads to similar explanations for the
non-electrolytic character of metallic conduction, for the high thermal and
electrical conductivities exhibited by metals, and for the Wiedemann-
Franz law as to the ratio between these conductivities. The picture
cannot agree in all details with the classical one, however, since we
now know that electrons, being subject to the Pauli exclusion principle,
would have to exhibit the properties of a Fermi-Dirac gas rather than
those of an ordinary perfect gas as assumed by Drude.

To determine the importance of this change in point of view, we
must obtain an idea as to the values of the important quantity

$$y = \frac{n h^{3}}{v g (2\pi m k T)^{3/2}} \tag{94.26}$$

which are to be expected in the case of typical metals. We may
calculate this quantity for the case of silver at room temperature, assuming
one free electron per atom which gives us an electron density of
approximately $5.9 \times 10^{22}$ per cubic centimetre. Taking $g = 2$ corresponding
to the two directions of electron spin, together with the approximate
values $n/v = 5.9 \times 10^{22}$ cm.$^{-3}$, $h = 6.5 \times 10^{-27}$ gm. cm.$^{2}$ sec.$^{-1}$,
$m = 9.0 \times 10^{-28}$ gm., $k = 1.4 \times 10^{-16}$ gm. cm.$^{2}$ sec.$^{-2}$ deg.$^{-1}$, and
$T = 300$ deg., we then obtain for the above quantity the large value

$$y = 2{,}200. \tag{94.27}$$

And, substituting in (94.21), this then gives us the large negative value

$$\alpha = -205 \tag{94.28}$$

for the parameter which determines the degree of degeneration of our
gas. On account of high electron concentration and low electron mass,
large negative values for $\alpha$ are thus found in general for typical metals
even up to considerable temperatures. We must hence conclude that
the Fermi-Dirac gas, by which we represent the conduction electrons
of a metal, will be in a condition of extreme degeneration, with
properties quite different from those of an ordinary perfect gas.
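The arithmetic leading to these two numbers can be repeated in a few lines. The sketch below takes the rounded CGS values quoted for silver (electron density $5.9 \times 10^{22}$ per cubic centimetre, $h = 6.5 \times 10^{-27}$, $m = 9.0 \times 10^{-28}$, $k = 1.4 \times 10^{-16}$, $T = 300$, $g = 2$) and obtains $\alpha$ by inverting the leading terms of the expansion for extreme degeneration; the function names are hypothetical, and the inversion formula is an assumption in the sense that it is derived here from the first correction term only.

```python
import math

# Rounded CGS values as quoted for silver at room temperature.
N_OVER_V = 5.9e22   # conduction electrons per cubic centimetre
H = 6.5e-27         # erg s
M = 9.0e-28         # g
K = 1.4e-16         # erg / deg
T = 300.0           # deg
G = 2.0             # two directions of electron spin


def degeneracy_parameter():
    """y = n h^3 / (v g (2 pi m k T)^(3/2))."""
    return N_OVER_V * H**3 / (G * (2.0 * math.pi * M * K * T) ** 1.5)


def alpha_extreme(y):
    """alpha for extreme degeneration: leading term plus first correction.

    Obtained by inverting y = 4 (-alpha)^(3/2) / (3 sqrt(pi)) * (1 + pi^2 / (8 alpha^2)).
    """
    a0 = (3.0 * math.sqrt(math.pi) * y / 4.0) ** (2.0 / 3.0)
    return -a0 * (1.0 - math.pi**2 / (12.0 * a0**2))


if __name__ == "__main__":
    y = degeneracy_parameter()
    print(f"y = {y:.0f}, alpha = {alpha_extreme(y):.1f}")
```

The correction term is utterly negligible at this degree of degeneration, so that $-\alpha$ is in effect just $(3\sqrt{\pi}\,y/4)^{2/3}$.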

To get a good qualitative idea of the properties of such a gas, it will
be helpful to note that there would be a great tendency under conditions
of extreme degeneration for most of the electrons to accumulate
in the eigenstates of lowest energy, filling each such state with the one
electron allowed by the Pauli exclusion principle. The reason for this
tendency can be seen with the help of our previous equation (94.1)

$$n_k = \frac{g_k}{e^{\alpha + \beta\epsilon_k} + 1}, \tag{94.29}$$

connecting the number of particles $n_k$ in a given energy range $\epsilon_k$ to
$\epsilon_k + \Delta\epsilon_k$ with the number of states $g_k$ in that range. With $\alpha$ large and
negative, this is found to make the number of electrons $n_k$ sensibly
equal to the number of states $g_k$ available, for all energies from zero
up to a value high enough to include nearly the total number of
electrons.

This finding, that most of the electrons would reside in the practically
completely filled eigenstates of lowest energy, provides an immediate
solution for the major difficulty of the classical Drude theory as to
the negligible contribution actually made by conduction electrons to
the total specific heat of a metal. This negligible contribution to total
specific heat is now explained by the consideration that most of the
electrons cannot be lifted to adjoining states of higher energy by rise
in temperature since these states are already filled. Only the electrons
in the layer between filled and empty states can be so affected.
Quantitatively this latter is seen to be unimportant, however, since our
previous formula (94.25) for the heat capacity $C_v$ of a degenerate Fermi-
Dirac gas, applied to the case of the conduction electrons in silver at
room temperature, gives the small value

$$C_v = 0.023\, nk \tag{94.30}$$

instead of the classically expected value

$$C_v = \tfrac32\, nk. \tag{94.31}$$
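The size of the electronic heat capacity can be estimated directly from $y$. The sketch below assumes the standard leading-order result for an extremely degenerate gas, $C_v = (\pi^{2}/2)\, nk\, (kT/\epsilon_0)$, where $\epsilon_0$ is the energy up to which the states are filled at the absolute zero and $\epsilon_0/kT = (3\sqrt{\pi}\,y/4)^{2/3}$; the function name and this way of writing the formula are introduced here and are not taken from the text.

```python
import math


def heat_capacity_ratio(y):
    """C_v / (n k) for an extremely degenerate Fermi-Dirac gas, leading order in T.

    Assumes C_v = (pi^2 / 2) n k (k T / eps0) with
    eps0 / (k T) = (3 sqrt(pi) y / 4)^(2/3).
    """
    eps0_over_kt = (3.0 * math.sqrt(math.pi) * y / 4.0) ** (2.0 / 3.0)
    return 0.5 * math.pi**2 / eps0_over_kt


if __name__ == "__main__":
    print("degenerate:", heat_capacity_ratio(2200.0))
    print("classical: ", 1.5)
```

With $y$ of about 2,200 the ratio comes out at a few hundredths, close to the figure quoted for silver and smaller than the classical value by a factor of roughly sixty.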

Our new idea, as to the existence of only a small fraction of the
conduction electrons in the layer between filled and empty states, also
provides the basis for Pauli’s brilliant treatment of the magnetic
properties of metals.† With the discovery that the electron has an intrinsic
magnetic moment for itself, it might be supposed at first sight that the
presence of free conduction electrons would make metals highly
paramagnetic, since the electrons acting as elementary magnets could all
orient themselves in the direction of any applied magnetic field. From
the point of view of our present theory it is evident, however, that
only the electrons in the boundary layer would be provided with unfilled
states of nearly the same energy which could be occupied after the
reorientation. This consideration, together with an allowance for the
diamagnetic effect resulting according to the quantum theory from
moving charges, has been found to give a good account of the magnetic
susceptibilities of the alkali metals.

Our present point of view also provides the basis for Sommerfeld’s
fairly complete theory of electrical and thermal conductivities and of
thermo-electric effects, together with such further elaborations as those
of Houston, Bloch, Peierls, Nordheim, Brillouin, and Wilson.‡ It also
provides the basis for the treatments of electron emission from metals
as carried out by Fowler, Nordheim, Wentzel, and others. It has,
however, provided no immediate understanding for the still unexplained
phenomena of super-conductivity at low temperatures.

† Pauli, Zeits. f. Phys. 41, 81 (1927).

‡ See Sommerfeld and Bethe, Handbuch der Physik, xxiv/2, second edition, Berlin,



XI


THE CHANGE IN QUANTUM MECHANICAL SYSTEMS WITH TIME

95. Dynamical reversibility in the quantum mechanics 

In the preceding chapter we have considered possibilities of applying 
statistical mechanics to quantum systems which are in a steady condi- 
tion of phenomenological equilibrium. In this chapter we may now 
commence the treatment of systems which are not in a steady condition. 
In the present section we shall first consider the status of dynamical 
reversibility in the quantum mechanics. In the next two sections, §§ 96 
and 97, we shall then undertake the mathematical problem of obtaining 
integrals for Schroedinger’s equation for the change in state of a system 
with time, both in the case of isolation and in the case when changes 
are made in some external parameter for the system such as its volume.
In § 98 we shall then turn to more physical questions by studying the
nature of the approximate observations by which in practice it might
actually be profitable to follow the changes in a quantum mechanical
system with time. In §§ 99 and 100, with the help of representative
ensembles to correspond to approximate observations on the nearly
steady states of a system, we shall then study time-proportional
transitions in general, and the special case of transitions by molecular collision
taking place with a probability proportional to the time. Finally, in
§ 101 we shall consider the general case of changes with time in an
ensemble of isolated members. Just as our consideration of the effect
of molecular collisions in changing the state of classical systems made
it possible to proceed to a study of Boltzmann’s H-theorem, so the
considerations of the present chapter will then make it possible to study
the natural analogue of that theorem in the quantum mechanics.

We may begin by investigating the possibility of dynamical
reversibility in the quantum mechanics, that is, in the case of any quantum
mechanical system which is changing with the time, the possibility
of specifying the conditions for a second system of entirely similar
structure which would be changing with the time in the reverse manner.
The question is of considerable importance for statistical mechanics and
thermodynamics, since it is evident that the actual phenomenological
irreversibilities of macroscopic behaviour would assume quite a different
aspect from that in classical theory, if the principle of dynamical
reversibility had to be abandoned in the quantum mechanics.




We must first define what we can reasonably mean in the quantum 
mechanics by two systems which are changing with time in the reverse 
manner. As in the corresponding classical treatment of Chapter V, we 
may here regard ourselves as concerned for simplicity with isolated 
conservative systems, since we take it as possible to look at any system 
from a point of view which would make this true. Let us denote the 
first system by the letter $S$, and, in accordance with the limited
possibilities of prediction provided by the quantum mechanics, take
$W(q,t)\,dq$ and $W(p,t)\,dp$ as giving the probabilities at time $t$ of finding
the coordinates and momenta for this system in the ranges $dq$ and $dp$,
and $\bar{F}(q,p,t)$ as the expectation value at time $t$ for any function $F(q,p)$ of
these coordinates and momenta. The second system may then be
denoted by the letter $S'$ and analogous symbols $W'(q,t)\,dq$, $W'(p,t)\,dp$,
and $\bar{F}'(q,p,t)$ be taken for the probabilities and expectation values for
this second system of similar structure. Using this notation, we shall
then say that the behaviour of the second system is the reverse of that
of the first if it satisfies the conditions

$$\begin{aligned}
W'(q,t) &= W(q,-t), \\
W'(p,t) &= W(-p,-t), \\
\bar{F}'(q,p,t) &= \bar{F}(q,-p,-t).
\end{aligned} \tag{95.1}$$

In accordance with these equations, the second system would then
exhibit at time $t$ the same probability for specified values of the
coordinates, the same probability for specified values of the momenta
taken with reversed sign, and the same expectation value for any
function of the coordinates and reversed momenta, as would be exhibited
by the first system at time $-t$. These conditions are thus the natural
analogue of the classical conditions for the reversal of motion, in which 
the two systems would exhibit precisely the same values of their 
coordinates, reversed momenta, and functions thereof at times t and 
—t respectively. 

To show the actual possibility of such reversed behaviour in the 
quantum mechanics we must consider Schroedinger’s equation which 
governs the change of quantum mechanical systems with time. In 
applying this equation we may restrict our considerations for simplicity 
to isolated, conservative systems and take the Hamiltonian operator 
H, for the system under investigation, as obtainable from the classical 
expression $H(q,p)$ for the energy in the simple manner

$$\mathbf{H} = H\!\left(q,\ \frac{h}{2\pi i}\frac{\partial}{\partial q}\right),$$

where the classical expression involves only even powers of the momenta.
This will make the operator $\mathbf{H}$ ‘real’, so that it will be unaffected in
passing from an equation in which this operator occurs to the
corresponding complex conjugate equation.

For the first of our two systems $S$ we may then write Schroedinger’s
equation (57.5) in the form

$$\mathbf{H}\,\psi(q,t) + \frac{h}{2\pi i}\frac{\partial \psi(q,t)}{\partial t} = 0, \tag{95.2}$$

and the corresponding complex conjugate equation, with the help of
the last of the above assumptions, in the simple form

$$\mathbf{H}\,\psi^{*}(q,t) - \frac{h}{2\pi i}\frac{\partial \psi^{*}(q,t)}{\partial t} = 0. \tag{95.3}$$

The expressions $\psi(q,t)$ and $\psi^{*}(q,t)$ occurring in the above equations are
to be regarded as giving, for the first of the two systems $S$, the actual
dependence of the probability amplitude and its complex conjugate on
the coordinates $q$ and time $t$.

For the second of our two systems $S'$ we shall then consider the
possibility of taking the probability amplitude $\psi'(q,t)$ and its complex
conjugate $\psi'^{*}(q,t)$ as determined by the equations

$$\psi'(q,t) = \psi^{*}(q,-t) \quad\text{and}\quad \psi'^{*}(q,t) = \psi(q,-t). \tag{95.4}$$

For this to be a possible specification it is evident that these new
quantities must also satisfy Schroedinger’s equation and its complex
conjugate equation

$$\mathbf{H}\,\psi'(q,t) + \frac{h}{2\pi i}\frac{\partial \psi'(q,t)}{\partial t} = 0$$

and

$$\mathbf{H}\,\psi'^{*}(q,t) - \frac{h}{2\pi i}\frac{\partial \psi'^{*}(q,t)}{\partial t} = 0.$$

Substituting, however, from (95.4), the first of these equations can be
written in the form

$$\mathbf{H}\,\psi^{*}(q,-t) - \frac{h}{2\pi i}\frac{\partial \psi^{*}(q,-t)}{\partial(-t)} = 0$$

and the second in the form

$$\mathbf{H}\,\psi(q,-t) + \frac{h}{2\pi i}\frac{\partial \psi(q,-t)}{\partial(-t)} = 0,$$

which are at once seen to be valid by comparison with (95.2) and (95.3).

We have thus found that (95.4) does give a possible specification for
the probability amplitude and its complex conjugate for the second
system. We shall now show that this specification satisfies the three




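requirements given by (95.1) for a behaviour of system $S'$ which is the
reverse of that of the first system $S$.

Making use of the relation (57.1) between probability density and
probability amplitude

$$W(q,t) = \psi^{*}(q,t)\,\psi(q,t),$$

we see that the specification given by (95.4) immediately satisfies the
first of the three requirements for reverse behaviour

$$W'(q,t) = W(q,-t). \tag{95.5}$$

This shows that the second system would exhibit at time $t$ the same
probability for specified values of the coordinates $q$ as the first system
exhibits at time $-t$.

Furthermore, making use of the transformation relation (57.2)
between probability amplitudes for coordinates and momenta, we
evidently obtain, with the help of (95.4),

$$\begin{aligned}
\phi'(p,t) &= h^{-f/2} \int e^{-\frac{2\pi i}{h}(p_1 q_1 + \dots + p_f q_f)}\, \psi'(q,t)\, dq \\
&= h^{-f/2} \int e^{-\frac{2\pi i}{h}(p_1 q_1 + \dots + p_f q_f)}\, \psi^{*}(q,-t)\, dq \\
&= \phi^{*}(-p,-t),
\end{aligned}$$

and similarly $\phi'^{*}(p,t) = \phi(-p,-t)$.

This then gives at once the next of our three requirements

$$W'(p,t) = W(-p,-t), \tag{95.6}$$

which shows that the second system would exhibit at time $t$ the same
probability for specified momenta taken with reversed sign as the first
system exhibits at time $-t$.

Finally, making use of the expression (57.4), which makes it possible
to calculate expectation values $\bar{F}$ from a knowledge of the corresponding
operator $\mathbf{F}$, we obtain, with the help of (95.4),

$$\begin{aligned}
\bar{F}'(q,p,t) &= \int \psi'^{*}(q,t)\, F\!\left(q, \frac{h}{2\pi i}\frac{\partial}{\partial q}\right) \psi'(q,t)\, dq \\
&= \int \psi(q,-t)\, F\!\left(q, \frac{h}{2\pi i}\frac{\partial}{\partial q}\right) \psi^{*}(q,-t)\, dq \\
&= \int \psi(q,-t) \left[F\!\left(q, -\frac{h}{2\pi i}\frac{\partial}{\partial q}\right) \psi(q,-t)\right]^{*} dq \\
&= \int \psi^{*}(q,-t)\, F\!\left(q, -\frac{h}{2\pi i}\frac{\partial}{\partial q}\right) \psi(q,-t)\, dq \\
&= \bar{F}(q,-p,-t),
\end{aligned} \tag{95.7}$$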





where the fourth form of writing is justified in accordance with (55.25) 
by the Hermitian character of the operator F. This result then gives 
the last of our three requirements for reversed behaviour by showing 
that the second system would exhibit at time i the same expectation 
value for any function of the coordinates and momenta as exhibited 
for the same function of coordinates and reversed momenta by the first 
system at time —t. 

We thus find that the principle of dynamical reversibility would hold 
in the quantum mechanics in much the same way as in the classical 
mechanics. Hence the introduction of the quantum mechanics, at 
least in its present form, cannot be regarded as throwing any new kind 
of light on the problem of the actual phenomenological irreversibility 
of thermodynamic processes. Just as in the classical mechanics, this 
irreversibility will have to be explained by considering the probable 
behaviour of a collection or ensemble of systems rather than from 
consideration of the purely mechanical behaviour of a single system. 

96. Integration of Schroedinger equation for changes with time in an isolated system 

(a) Introduction. Before proceeding to more physical considerations, 
it will first be profitable to make a preliminary disposition of purely 
mathematical problems connected with the integration of Schroedinger’s 
differential equation for the change in the state of a quantum mechani- 
cal system with time. In the present section we shall consider isolated 
systems, and in the next section systems where some external para- 
meter such as volume is itself changed with time. 

In the case of an isolated quantum mechanical system it is of course 
evident that we could in any case write Schroedinger’s equation for 
the dependence of any state $\psi(q,t)$ on the time in the quite general 
integrated form 

$$\psi(q,t) = \sum_k c_k\, u_k(q)\, e^{-(2\pi i/h) E_k t},$$

where the $c_k$ are constant coefficients whose squared moduli $c_k^* c_k$ give the 
probabilities for the different true energy eigenstates $k$ for the system 
corresponding to the possible eigensolutions $u_k(q)$ and eigenvalues of 
energy $E_k$. Such an expansion in terms of true energy solutions is not 
usually convenient, however, for treating the temporal behaviour of 
a system, since we are not ordinarily interested in the fact that there 
would be no change with time in the relative probabilities for the truly 
steady states of an isolated system (whose close determination would 
indeed take an exceedingly long time), but are, on the other hand, 
interested in the actual change with time in the probabilities for states 
of the system which could be closely determined by observations taking 
a length of time short compared with that needed for appreciable 
change. 

For this reason we shall now begin by considering the integration 
of Schroedinger’s equation, by the method of variation of constants, 
when the state of the system is expanded in terms of nearly steady 
states on which observations might be made. We shall then follow 
by considering a general method of integration when the state of the 
system is expanded in terms of any kind of eigenstates that we might 
wish to observe. 

(b) Expansion of state in terms of unperturbed energy eigenstates and 
integration by the method of variation of constants. The integration of 
the Schroedinger equation by the method of variation of constants has 
already been discussed in § 68, and it will now merely be necessary to 
recite certain results in a form available for use in the present chapter. 
To apply the method we consider the Hamiltonian operator H for the 
system to be expressed as the sum of two terms 

$$\mathbf{H} = \mathbf{H}^0 + \mathbf{V}, \tag{96.1}$$

where the unperturbed Hamiltonian operator $\mathbf{H}^0$ corresponds to a nearly 
precise expression for the energy of the system, and the perturbation 
operator $\mathbf{V}$ to the remainder necessary for an exact expression of the 
energy. The unperturbed Hamiltonian may then be used to determine 
a set of normalized orthogonal eigenfunctions $u_k(q)$ for the system with 
the help of the characteristic equations 

$$\mathbf{H}^0 u_k(q) = E_k^0\, u_k(q), \tag{96.2}$$

where the $E_k^0$ and $u_k(q)$ may be called the unperturbed energy eigenvalues 
and unperturbed energy eigenfunctions for the system. Since 
these eigenfunctions are taken as providing a complete set, the probability 
amplitude for the state of a system could then be expressed as 
a function of time $t$ by a summation of the form 

$$\psi(q,t) = \sum_k c_k(t)\, u_k(q)\, e^{-(2\pi i/h) E_k^0 t}, \tag{96.3}$$

where the probability coefficients $c_k(t)$, which would be constants if we 
had made an expansion in terms of true energy eigenfunctions, are 
now allowed to vary in the manner actually demanded by the quantum 
mechanics. As an actually appropriate expression for the time dependence 
of these probability coefficients, we have obtained (68.6) the 
specialized form of the Schroedinger equation 

$$\frac{dc_n(t)}{dt} = -\frac{2\pi i}{h} \sum_k V_{nk}\, c_k(t)\, e^{(2\pi i/h)(E_n^0 - E_k^0)t}, \tag{96.4}$$

where the $V_{nk}$ are the elements of an Hermitian matrix defined by 

$$V_{nk} = \int u_n^*(q)\, \mathbf{V}\, u_k(q)\, dq, \tag{96.5}$$

with 

$$V_{nk} = V_{kn}^*. \tag{96.6}$$


The differential equations (96.4) for change with time will in general 
form an infinite set and contain so far no approximation. By solving 
them, we could in principle determine for any case of interest the values 
of the coefficients $c_n(t)$ as a function of time. This would then give us 
the probability 

$$W_n(t) = c_n^*(t)\, c_n(t) \tag{96.7}$$

of finding the system at time $t$ in any state of interest $n$, provided we 
normalize to unity with 

$$\sum_n W_n(t) = \sum_n c_n^*(t)\, c_n(t) = 1. \tag{96.8}$$


Although the precise solution of equations (96.4) would be complicated, 
we can easily obtain an approximate integration by assuming 
that the separation of the true Hamiltonian into two parts as given by 
(96.1) has actually been carried out in such a way as to make the effect 
of the perturbation operator $\mathbf{V}$ really small. Under these circumstances 
the matrix components $V_{nk}$ in (96.4) will be small and the coefficients 
$c_n(t)$ will be changing but slowly with the time. Hence we can make 
an approximate integration over a short time interval, in the 
neighbourhood of $t = 0$, by assigning to the coefficients $c_k(t)$ on the 
right-hand side of (96.4) the values $c_k(0)$ which they have at time $t = 0$. 
Doing so we then obtain in the neighbourhood of $t = 0$ the simple result 

$$c_n(t) = c_n(0) + \sum_k V_{nk}\, \frac{1 - e^{(2\pi i/h)(E_n^0 - E_k^0)t}}{E_n^0 - E_k^0}\, c_k(0), \tag{96.9}$$

as can readily be verified by redifferentiation. By resubstituting this 
result into (96.4), we could then obtain a higher stage of approximation, 
and so on to any desired accuracy. We shall leave the consideration 
of the exact integration of Schroedinger’s equation, however, for the 
following very general treatment, where we consider the expansion of 
state in terms of any kind of eigenfunctions that may be desired. 

A specially simple case of interest arises when the system is known 
at time $t = 0$ to be in a particular state $k$ with 

$$W_k(0) = c_k^*(0)\, c_k(0) = 1, \tag{96.10}$$

and with all other coefficients $c_n(0)$ equal to zero. For any state $n$ not 
the same as $k$, equation (96.9) now reduces to 

$$c_n(t) = V_{nk}\, \frac{1 - e^{(2\pi i/h)(E_n^0 - E_k^0)t}}{E_n^0 - E_k^0}\, c_k(0). \tag{96.11}$$

Noting (96.10), the probability of finding the system in state $n$ at any 
time in the neighbourhood of $t = 0$ then becomes 

$$W_n(t) = V_{nk}^* V_{nk}\, \frac{2 - e^{(2\pi i/h)(E_n^0 - E_k^0)t} - e^{-(2\pi i/h)(E_n^0 - E_k^0)t}}{(E_n^0 - E_k^0)^2},$$

or by a simple transformation to trigonometric form 

$$W_n(t) = 4\,|V_{nk}|^2\, \frac{\sin^2\!\left[\frac{\pi}{h}(E_n^0 - E_k^0)\,t\right]}{(E_n^0 - E_k^0)^2}. \tag{96.12}$$

In accordance with this result we see, on account of the appearance 
of the so-called resonance denominator $(E_n^0 - E_k^0)^2$, that the probability 
of finding the system in state $n$ will continue to increase with the time $t$ 
for a longer interval, the closer the unperturbed energy $E_n^0$ of the system is to 
the original unperturbed energy $E_k^0$. Exact equality of $E_n^0$ with $E_k^0$ is 
not necessary for a transition to take place, since the above are only 
unperturbed rather than true energy levels for the actual system. 
Starting the system off at time $t = 0$ in the state $k$ would not be 
equivalent to putting it in a steady state of definite energy from which 
no transitions could occur, but would correspond to a distribution over 
various possible values of the true energy with a relatively high 
probability of finding values close to $E_k^0$ if a measurement were made. 
This relatively high probability of finding values of the energy in the 
neighbourhood of $E_k^0$ will then be preserved in time in accordance with 
the quantum analogue of the principle of the conservation of energy 
as discussed in § 63 (c). 
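
The quality of the approximation (96.12) can be judged in the simplest possible case. The following Python sketch is an illustration under stated assumptions, not part of the original treatment: it assumes NumPy, a system with only two unperturbed states $k$ and $n$, a perturbation with off-diagonal elements only, and units in which $h = 1$. The function names and numerical values are hypothetical.

```python
import numpy as np


def exact_probability(e_k, e_n, v_nk, t, h=1.0):
    """Probability of state n at time t, starting in state k, from the full
    two-state Hamiltonian written in the unperturbed basis (order k, n)."""
    ham = np.array([[e_k, np.conj(v_nk)], [v_nk, e_n]], dtype=complex)
    energies, vecs = np.linalg.eigh(ham)
    phases = np.exp(-2j * np.pi * energies * t / h)
    evolution = (vecs * phases) @ vecs.conj().T
    return float(abs(evolution[1, 0]) ** 2)


def first_order_probability(e_k, e_n, v_nk, t, h=1.0):
    """Trigonometric form (96.12) of the first-order result."""
    gap = e_n - e_k
    return float(4 * abs(v_nk) ** 2 * np.sin(np.pi * gap * t / h) ** 2 / gap ** 2)


# Weak perturbation: |V_nk| is small compared with the unperturbed energy gap.
for t in (0.1, 0.3, 0.5):
    print(t, exact_probability(0.0, 1.0, 0.01, t),
          first_order_probability(0.0, 1.0, 0.01, t))
```

With the perturbation taken small compared with the energy separation, the two columns printed should agree closely; increasing `v_nk` shows where the first-order treatment begins to fail.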

With the help of (96.12) we can also calculate, at times near $t = 0$, 
the probability of still finding the system in the original state $k$, since 
the fact that the total probability of finding the system in one or 
another state must remain equal to unity will at once permit us to write 

$$W_k(t) = 1 - 4 \sum_{n \neq k} |V_{nk}|^2\, \frac{\sin^2\!\left[\frac{\pi}{h}(E_n^0 - E_k^0)\,t\right]}{(E_n^0 - E_k^0)^2}. \tag{96.13}$$

This result can also be verified by direct calculation starting from the 
differential equations (96.4), provided we carry the computation to 
the next stage of approximation beyond that employed above. 

This provides an account of the approximate integration of the 
Schroedinger equation, by the method of variation of constants, which 
will be sufficient for the purposes of the present chapter. The results 
obtained are expressed in terms of the probability coefficients $c_n(t)$ for 
a set of unperturbed energy eigenstates of the system and are only 
valid provided the unperturbed Hamiltonian $\mathbf{H}^0$ which determines these 
eigenstates is sufficiently close to the true Hamiltonian $\mathbf{H}$ for the 
system, and provided the time interval over which the integration is 
taken is not too long. The results are of special interest because of the 
possibility for a simple understanding of the character of the 
unperturbed energy eigenstates, or nearly steady states, which furnish the 
quantum mechanical language employed, and because of their later 
application in studying, with the help of statistical considerations, 
transitions which occur with a probability that is proportional to the 
time between groups of neighbouring nearly steady states. 

(c) Expansion of state in terms of general eigenfunctions and integration 
as a Taylor’s series in the time. We may now turn to a 
consideration of the more general problem of obtaining a formally exact 
integration for the Schroedinger equation expressed in the generalized 
language corresponding to any set of eigenfunctions for the system. 
This will give us a very powerful and actually simple mathematical 
formalism for treating the changes in a quantum mechanical system 
with time, but nevertheless a formalism which may not seem very 
transparent from the point of view of an intuitive appreciation of the 
character of the results. 

In accordance with our discussion of the quantum mechanical 
transformation theory in § 67, we may give a very general expression for the 
quantum mechanical state of a system in the form 

$$\psi(q,t) = \sum_k a_k(t)\, u_k(q), \tag{96.14}$$

where the $u_k(q)$ are any desired complete set of normalized orthogonal 
eigenfunctions for the system, and the $a_k(t)$ may be called the generalized 
probability amplitudes for the different eigenstates $k$. Making use 
of this formalism, the content of Schroedinger’s equation can then be 
expressed, in accordance with (67.13), in the general form 

$$\frac{da_n(t)}{dt} = -\frac{2\pi i}{h} \sum_k H_{nk}\, a_k(t), \tag{96.15}$$

where the elements $H_{nk}$ of the Hermitian matrix are related to the 
Hamiltonian operator $\mathbf{H}$ for the system by the expression 

$$H_{nk} = \int u_n^*(q)\, \mathbf{H}\, u_k(q)\, dq, \tag{96.16}$$

with 

$$H_{nk} = H_{kn}^*. \tag{96.17}$$

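In order to obtain an integrated expression for the generalized 
equation (96.15), it will only be necessary to consider the values of the 
successive derivatives of the probability amplitude $a_n(t)$ with respect 
to $t$ at some initial time $t = 0$, and then use these to give a Taylor’s 
expansion for the amplitude $a_n(t)$ at any desired later time $t = t$. For 
the first derivative at $t = 0$ we can write at once from (96.15) 

$$\left(\frac{da_n}{dt}\right)_{t=0} = -\frac{2\pi i}{h} \sum_k H_{nk}\, a_k(0). \tag{96.18}$$

For the second derivative we can then write 

$$\left(\frac{d^2 a_n}{dt^2}\right)_{t=0} = -\frac{2\pi i}{h} \sum_l H_{nl} \left(\frac{da_l}{dt}\right)_{t=0} = \left(-\frac{2\pi i}{h}\right)^2 \sum_l \sum_k H_{nl} H_{lk}\, a_k(0) = \left(-\frac{2\pi i}{h}\right)^2 \sum_k (H^2)_{nk}\, a_k(0), \tag{96.19}$$

where the first two forms of writing are made possible by the successive 
application of equation (96.15), and the last form by the rules for 
matrix multiplication. Proceeding in this manner, we should then 
evidently obtain as a general expression for the $N$th derivative 

$$\left(\frac{d^N a_n}{dt^N}\right)_{t=0} = \left(-\frac{2\pi i}{h}\right)^N \sum_k (H^N)_{nk}\, a_k(0). \tag{96.20}$$

Making use of expansion in the form of a Taylor’s series, we can 
then write for the probability amplitude $a_n(t)$ at any desired time $t$ the 
expression 

$$a_n(t) = a_n(0) + \left(-\frac{2\pi i}{h}\,t\right) \sum_k H_{nk}\, a_k(0) + \frac{1}{2!}\left(-\frac{2\pi i}{h}\,t\right)^2 \sum_k (H^2)_{nk}\, a_k(0) + \dots, \tag{96.21}$$

thus obtaining the desired integration of the generalized Schroedinger 
equation (96.15). 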

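This result, moreover, can be written in quite a compact form if we 
introduce an exponential operator $e^{-(2\pi i/h)\mathbf{H}t}$ which we regard as 
equivalent to the series expansion 

$$e^{-(2\pi i/h)\mathbf{H}t} = \sum_{N=0}^{\infty} \frac{1}{N!} \left(-\frac{2\pi i}{h}\,t\right)^N \mathbf{H}^N. \tag{96.22}$$

Considering the matrix elements corresponding to the successive terms 
of this series, we then see that the integrated Schroedinger equation 
(96.21) can be re-expressed in the form 

$$a_n(t) = \sum_k \left(e^{-(2\pi i/h)\mathbf{H}t}\right)_{nk} a_k(0). \tag{96.23}$$

Furthermore, we may give an immediate demonstration of the 
satisfactory character of this solution of Schroedinger’s equation, since we 
obtain by redifferentiation 

$$\begin{aligned}
\frac{da_n(t)}{dt} &= -\frac{2\pi i}{h} \sum_k \left(e^{-(2\pi i/h)\mathbf{H}t}\, \mathbf{H}\right)_{nk} a_k(0) \\
&= -\frac{2\pi i}{h} \sum_k \left(\mathbf{H}\, e^{-(2\pi i/h)\mathbf{H}t}\right)_{nk} a_k(0) \\
&= -\frac{2\pi i}{h} \sum_{k,l} H_{nl} \left(e^{-(2\pi i/h)\mathbf{H}t}\right)_{lk} a_k(0) \\
&= -\frac{2\pi i}{h} \sum_l H_{nl}\, a_l(t),
\end{aligned} \tag{96.24}$$

where the second form of writing is made possible by the commutativity 
of the operator $\mathbf{H}$ with any function of itself, the third form of writing 
comes from the rules for matrix multiplication, and the last form of 
writing comes from a further application of the original formula (96.23). 
It will be noted that this last form does agree with the generalized 
Schroedinger equation (96.15). 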
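
For a system with a finite number of eigenstates the series (96.22) can be summed directly. The following Python sketch is an illustration only: it assumes NumPy, a hypothetical three-state Hermitian matrix for the Hamiltonian, units in which $h = 1$, and a truncation of the series after a fixed number of terms. It compares the truncated Taylor series with the result obtained by diagonalizing the matrix.

```python
import numpy as np


def evolution_by_series(ham, t, h=1.0, n_terms=40):
    """Partial sum of the series (96.22) for exp(-(2 pi i / h) H t)."""
    factor = -2j * np.pi * t / h
    term = np.eye(ham.shape[0], dtype=complex)  # N = 0 term
    total = term.copy()
    for order in range(1, n_terms):
        # Build the N-th term from the (N-1)-th: multiply by factor*H and divide by N.
        term = term @ (factor * ham) / order
        total = total + term
    return total


def evolution_by_eigenvalues(ham, t, h=1.0):
    """Same operator obtained from the eigenvalues and eigenvectors of H."""
    energies, vecs = np.linalg.eigh(ham)
    return (vecs * np.exp(-2j * np.pi * energies * t / h)) @ vecs.conj().T


# Hypothetical Hermitian matrix H_nk and initial amplitudes a_k(0).
ham = np.array([[1.0, 0.2, 0.0], [0.2, 1.5, 0.1j], [0.0, -0.1j, 2.0]])
a_initial = np.array([1.0, 0.0, 0.0], dtype=complex)

u_series = evolution_by_series(ham, 0.2)
u_exact = evolution_by_eigenvalues(ham, 0.2)
a_later = u_series @ a_initial  # amplitudes a_n(t) as in (96.23)
print(np.max(np.abs(u_series - u_exact)))
```

The series converges for any $t$, but the number of terms needed grows with the product of $t$ and the largest energy, which is why the diagonalized form is usually preferred in numerical work.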

(d) Change with time regarded as a unitary transformation. It will 
be convenient to introduce the symbol 

$$\mathbf{U}(t) = e^{-(2\pi i/h)\mathbf{H}t} \tag{96.25}$$

to denote the important operator defined by (96.22). Our solution of 
the generalized Schroedinger equation (96.23) can then be written in the 
very simple form 

$$a_n(t) = \sum_k U_{nk}(t)\, a_k(0). \tag{96.26}$$

The consequences of going from time $t = 0$ to time $t = t$ can now be 
conveniently spoken of as the result of a transformation with the help 
of the matrix elements $U_{nk}(t)$ corresponding to the transformation 
operator $\mathbf{U}(t)$. We now proceed to consider the properties of such a 
transformation. 

We may first show the possibility† of obtaining the inverse operator 
to $\mathbf{U}(t)$ which will permit us to calculate the earlier probability 
amplitudes $a_k(0)$ in terms of the later ones $a_n(t)$. In accordance with the 
equation of definition (96.25) we can evidently write for any two times 
$t_1$ and $t_2$ 

$$\mathbf{U}(t_1)\,\mathbf{U}(t_2) = \mathbf{U}(t_1 + t_2), \tag{96.27}$$

and hence in particular may write 

$$\mathbf{U}(-t)\,\mathbf{U}(t) = \mathbf{U}(0) = \mathbf{I}, \tag{96.28}$$

where $\mathbf{I}$ is the identity operator, with the corresponding matrix elements 

$$\delta_{lk} = \sum_n U_{ln}(-t)\, U_{nk}(t). \tag{96.29}$$

Multiplying (96.26) by $U_{ln}(-t)$ and summing over $n$, we then obtain 

$$\sum_n U_{ln}(-t)\, a_n(t) = \sum_k \sum_n U_{ln}(-t)\, U_{nk}(t)\, a_k(0) = \sum_k \delta_{lk}\, a_k(0),$$

which gives 

$$a_l(0) = \sum_n U_{ln}(-t)\, a_n(t). \tag{96.30}$$

We thus see that $\mathbf{U}(-t)$, with the corresponding matrix elements 
$U_{ln}(-t)$, is the desired inverse operator. 

We may next obtain a simple relation connecting the matrix elements 
corresponding to the two operators $\mathbf{U}(t)$ and $\mathbf{U}(-t)$. Making use of the 
series expansion (96.22), we can write as a matrix element corresponding 
to $\mathbf{U}(t)$ 

$$U_{nk}(t) = \sum_N \frac{1}{N!} \left(-\frac{2\pi i}{h}\,t\right)^N (H^N)_{nk}, \tag{96.31}$$

and hence by the ordinary rules can write for its complex conjugate 

$$U_{nk}^*(t) = \sum_N \frac{1}{N!} \left(\frac{2\pi i}{h}\,t\right)^N (H^N)_{nk}^*. \tag{96.32}$$

Similarly, we can evidently write as a matrix element corresponding 
to $\mathbf{U}(-t)$ 

$$U_{kn}(-t) = \sum_N \frac{1}{N!} \left(\frac{2\pi i}{h}\,t\right)^N (H^N)_{kn} = \sum_N \frac{1}{N!} \left(\frac{2\pi i}{h}\,t\right)^N (H^N)_{nk}^*, \tag{96.33}$$

where the second form of expression is made possible by the Hermitian 
character of the operator $\mathbf{H}^N$. Comparing (96.32) and (96.33), we then 
obtain the desired relation connecting the two kinds of matrix elements 

$$U_{kn}(-t) = U_{nk}^*(t), \tag{96.34}$$

showing incidentally that the operator $\mathbf{U}(t)$ is not Hermitian in 
character. 

† For certain general restrictions implied by the existence of the inverse operator see 
the discussion of Pauli, Handbuch der Physik, xxiv/1, second edition, Berlin, 1933, p. 142. 

Finally, with the help of (96.34), we may now express an important 
summation property for the matrix elements corresponding to $\mathbf{U}(t)$ by 
rewriting (96.29) in the form 

$$\sum_n U_{nl}^*(t)\, U_{nk}(t) = \delta_{lk}. \tag{96.35}$$

Similarly, we can obtain 

$$\sum_n U_{ln}(t)\, U_{kn}^*(t) = \delta_{lk}. \tag{96.36}$$

In accordance with these summation properties, and with the 
existence of the inverse operator as demonstrated above, we may now 
describe the transformation brought about by the operator $\mathbf{U}(t)$ as 
having unitary character. It is to be emphasized, however, that this 
unitary transformation, to the same quantum mechanical language at 
a later time, is not to be confused with our previous possibility, see 
§ 67 (e), for a unitary transformation to a different quantum mechanical 
language at a given time. 
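
The properties collected in this subsection, namely the composition rule (96.27), the relation (96.34) between $\mathbf{U}(-t)$ and $\mathbf{U}(t)$, and the summation properties (96.35) and (96.36), can be checked on a small matrix. The following Python sketch is an illustration only; it assumes NumPy, a hypothetical three-state Hermitian Hamiltonian, and units in which $h = 1$.

```python
import numpy as np


def evolution_operator(ham, t, h=1.0):
    """Matrix of U(t) = exp(-(2 pi i / h) H t) built from the eigenvalues of H."""
    energies, vecs = np.linalg.eigh(ham)
    return (vecs * np.exp(-2j * np.pi * energies * t / h)) @ vecs.conj().T


# Hypothetical Hermitian matrix H_nk.
ham = np.array([[1.0, 0.2, 0.0], [0.2, 1.5, 0.1j], [0.0, -0.1j, 2.0]])
t1, t2 = 0.3, 0.45
u1 = evolution_operator(ham, t1)
u2 = evolution_operator(ham, t2)
identity = np.eye(3)

# (96.27): U(t1) U(t2) = U(t1 + t2).
composition_holds = np.allclose(u1 @ u2, evolution_operator(ham, t1 + t2))
# (96.34): U_kn(-t) = U_nk(t)*, so U(-t) is the conjugate transpose of U(t).
inverse_is_adjoint = np.allclose(evolution_operator(ham, -t1), u1.conj().T)
# (96.35) and (96.36): the two summation properties.
columns_orthonormal = np.allclose(u1.conj().T @ u1, identity)
rows_orthonormal = np.allclose(u1 @ u1.conj().T, identity)
# U(t) itself is in general not Hermitian.
is_hermitian = np.allclose(u1, u1.conj().T)

print(composition_holds, inverse_is_adjoint, columns_orthonormal,
      rows_orthonormal, is_hermitian)
```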

(e) Application to the calculation of probabilities as a function of time. 
We may now apply the foregoing apparatus to the important problem 
of calculating the probability $W_n(t)$ of finding a system at time $t = t$ 
in a given state $n$ from a specification of its initial state at an earlier 
time $t = 0$. 

In accordance with our fundamental equation (96.26) we can write 

$$a_n(t) = \sum_k U_{nk}\, a_k(0), \tag{96.37}$$

where $a_n(t)$ is the probability amplitude at time $t = t$ for the final state 
$n$, and the $a_k(0)$ are the various probability amplitudes at time $t = 0$ 
for the initial states $k$, and where we shall now often find it convenient 
to write $U_{nk}$ in place of the more explicit $U_{nk}(t)$. Similarly, we can write 

$$a_n^*(t) = \sum_l U_{nl}^*\, a_l^*(0) \tag{96.38}$$

for the complex conjugate amplitude. Hence, multiplying (96.37) by 
(96.38), we now obtain for the probability of interest 

$$W_n(t) = a_n^*(t)\, a_n(t) = \sum_{k,l} U_{nl}^* U_{nk}\, a_l^*(0)\, a_k(0). \tag{96.39}$$

Or summing separately for the cases $l = k$ and $l \neq k$, we can also write 
this in the form 

$$W_n(t) = \sum_k |U_{nk}|^2\, W_k(0) + \sum_k \sum_{l \neq k} U_{nl}^* U_{nk}\, a_l^*(0)\, a_k(0). \tag{96.40}$$

The general character, precise validity, and formal simplicity of the 
results obtained by the present methods may well be emphasized. In 
this connexion we note the simple form of the fundamental expression 
(96.26) obtained in the present treatment, 

$$a_n(t) = \sum_k U_{nk}\, a_k(0), \tag{96.41}$$

for the precise connexion between probability amplitudes, at times 
$t = 0$ and $t = t$, in the general language corresponding to any selection 
of quantum mechanical eigenstates. This may be contrasted with the 
considerably more complicated form of the analogous expression (96.9) 
obtained in the preceding treatment, 

$$c_n(t) = c_n(0) + \sum_k V_{nk}\, \frac{1 - e^{(2\pi i/h)(E_n^0 - E_k^0)t}}{E_n^0 - E_k^0}\, c_k(0), \tag{96.42}$$

for the approximate connexion between probability coefficients, at 
times $t = 0$ and $t = t$ sufficiently close together, and in the special 
language corresponding to some set of unperturbed energy eigenstates. It 
is also of interest to compare the general connexion between probabilities 
at $t = 0$ and $t = t$ given by (96.39) with the special connexion given in 
the preceding treatment by (96.12), where it is to be remembered that 
(96.12) applies only when the system starts out in a single eigenstate 
$k$ at $t = 0$. Nevertheless, it must also be remarked that the simple final 
form of the general and precise results of the present treatment has been 
obtained in part by the device of introducing simple symbols for 
quantities which in practical applications would really be of a complicated 
character, and that the development of a satisfactory physical intuition 
for the significance of the present results may seem difficult. 
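
The separation of the probability (96.39) into the terms with $l = k$ and the terms with $l \neq k$, as in (96.40), can be made concrete for a small system. The following Python sketch is an illustration only; it assumes NumPy, a hypothetical three-state Hermitian Hamiltonian, hypothetical initial amplitudes, and units in which $h = 1$.

```python
import numpy as np


def evolution_operator(ham, t, h=1.0):
    """Matrix of U(t) = exp(-(2 pi i / h) H t) built from the eigenvalues of H."""
    energies, vecs = np.linalg.eigh(ham)
    return (vecs * np.exp(-2j * np.pi * energies * t / h)) @ vecs.conj().T


# Hypothetical Hamiltonian and normalized initial amplitudes a_k(0).
ham = np.array([[1.0, 0.2, 0.0], [0.2, 1.5, 0.1j], [0.0, -0.1j, 2.0]])
a_initial = np.array([0.6, 0.8j, 0.0])
u = evolution_operator(ham, 0.7)

# terms[n, k] = U_nk a_k(0); summing over k gives a_n(t).
terms = u * a_initial
w_total = np.abs(terms.sum(axis=1)) ** 2        # full double sum (96.39)
w_diagonal = (np.abs(terms) ** 2).sum(axis=1)   # terms with l = k

# Terms with l != k, formed explicitly from all pairs (l, k).
pairs = np.einsum("nl,nk->nlk", terms.conj(), terms)
w_cross = (pairs.sum(axis=(1, 2)) - np.einsum("nkk->n", pairs)).real

print(w_total, w_diagonal, w_cross)
```

The first part depends only on the initial probabilities, while the second depends on the relative phases of the initial amplitudes; it is this second part that has no counterpart in a treatment by probabilities alone.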

In studying the temporal behaviour of ensembles of isolated systems, 
we shall first make use of the results of the preceding treatment, and 
shall be led to the easily appreciated — but special and only approxi- 
mately valid — notion of transitions between different conditions which 
take place with a probability proportional to the time. In our final 
studies of temporal behaviour, however, we shall make use of the 
results of the present treatment, and shall be led to less transparent, 
but nevertheless general and precise methods for treating the changes 
in ensembles with time. 




97. Integration of Schroedinger equation when an external 
parameter is varied 

(a) Probability amplitudes for energy states that depend on an external 
parameter. The preceding section dealt with the integration of the 
Schroedinger equation in the case of an isolated system. We must now 
consider a method of procedure which can be used in treating the temporal 
behaviour of a system having an external parameter which is 
itself changed with time in some prescribed manner.† We shall not 
have to make use of the results obtained in this section in the present 
chapter, but shall need them at a later place, in § 124 of Chapter XIII, 
when we discuss adiabatic changes in the external coordinates for a 
thermodynamic system. 

Let us consider a system having a Hamiltonian operator 

$$\mathbf{H}\!\left(q, \frac{h}{2\pi i}\frac{\partial}{\partial q}, a\right)$$

with a form dependent on the value of some external parameter $a$ (for 
example, the volume of the system) which we can later regard as 
varying with the time. For any particular value of the parameter $a$ 
we can then determine a set of energy eigenfunctions $u_k(q,a)$ with the 
help of the equations 

$$\mathbf{H}\!\left(q, \frac{h}{2\pi i}\frac{\partial}{\partial q}, a\right) u_k(q,a) = E_k(a)\, u_k(q,a), \tag{97.1}$$

where the quantities $E_k(a)$ are the eigenvalues of the energy of the 
system with that value of the parameter $a$. These eigenfunctions may 
be taken as forming a complete, normalized, orthogonal set. For the 
given value of the parameter $a$, the solution of Schroedinger’s equation 
for the system, 

$$\mathbf{H}\!\left(q, \frac{h}{2\pi i}\frac{\partial}{\partial q}, a\right) \psi(q,t) + \frac{h}{2\pi i}\frac{\partial \psi(q,t)}{\partial t} = 0, \tag{97.2}$$

could then be expressed, as a superposition of truly steady state 
solutions, in the form 

$$\psi(q,t) = \sum_k c_k\, u_k(q,a)\, e^{-(2\pi i/h) E_k(a)\, t}, \tag{97.3}$$

where the quantities $c_k$ are a set of constants which will in general be 
complex numbers. 

† The method of treatment will be similar to that of Güttinger, Zeits. f. Phys. 73, 
169 (1931-2). 

Let us now consider that the external parameter $a$ instead of being 
held constant is made to change with time in accordance with some 
prescribed law. The Hamiltonian operator itself would then be a 
function of the time through its dependence on $a$, and we should have to 
write Schroedinger’s equation in the form, appropriate for 
non-conservative systems, 

$$\mathbf{H}\!\left(q, \frac{h}{2\pi i}\frac{\partial}{\partial q}, a(t)\right) \psi(q,t) + \frac{h}{2\pi i}\frac{\partial \psi(q,t)}{\partial t} = 0. \tag{97.4}$$

As a form of solution for this equation we shall then find it profitable 
to take the superposition expressed by 

$$\psi(q,t) = \sum_k c_k(t)\, u_k(q,a)\, e^{-(2\pi i/h) E_k(a)\, t}, \tag{97.5}$$

where at each time $t$ we use the steady state solutions that correspond 
to the instantaneous value of $a$ that prevails, and now allow the 
quantities $c_k(t)$ to change with time in whatever manner is actually 
demanded by Schroedinger’s equation. This proposed solution may 
also be expressed in a convenient form by combining the two factors 
giving an explicit dependence on $t$, by substituting 

$$C_k(t) = c_k(t)\, e^{-(2\pi i/h) E_k(a)\, t}. \tag{97.6}$$

This then gives us 

$$\psi(q,t) = \sum_k C_k(t)\, u_k(q,a). \tag{97.7}$$

The quantities $C_k(t)$ are then the probability amplitudes for the different 
states $k$ that correspond to the instantaneous forms of the 
eigenfunctions $u_k(q,a)$. 

To determine how these probability amplitudes depend on the time 
we may now substitute the proposed solution (97.7) in Schroedinger’s 
equation (97.4). Noting (97.1), this gives us 

$$\sum_k E_k(a)\, C_k(t)\, u_k(q,a) + \frac{h}{2\pi i}\frac{\partial}{\partial t} \sum_k C_k(t)\, u_k(q,a) = 0, \tag{97.8}$$

or carrying out the indicated differentiation this can be written in 
the form 

$$\sum_k \dot{C}_k(t)\, u_k(q,a) + \frac{2\pi i}{h} \sum_k C_k(t)\, E_k(a)\, u_k(q,a) + \dot{a} \sum_k C_k(t)\, \frac{\partial u_k(q,a)}{\partial a} = 0, \tag{97.9}$$

where the last term takes care of the change with time in the 
eigenfunctions which we are using. Multiplying through by $u_n^*(q,a)$ 
and integrating over the configuration space, making use of the 
normalization and orthogonality of the eigenfunctions $u_k(q,a)$, we can 
then obtain 

$$\dot{C}_n(t) + \frac{2\pi i}{h}\, C_n(t)\, E_n(a) + \dot{a} \sum_k C_k(t) \int u_n^*(q,a)\, \frac{\partial u_k(q,a)}{\partial a}\, dq = 0. \tag{97.10}$$

Before making use of this equation it will be desirable to consider 
the nature of the integrals appearing in the last term. 

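Let us first consider the case $k = n$. We should then evidently be 
able to write 

$$\int u_n^*(q,a)\, \frac{\partial u_n(q,a)}{\partial a}\, dq + \int \frac{\partial u_n^*(q,a)}{\partial a}\, u_n(q,a)\, dq = \frac{\partial}{\partial a} \int u_n^*(q,a)\, u_n(q,a)\, dq = 0, \tag{97.11}$$

since the eigenfunctions are normalized to unity for all values of the 
parameter $a$. Hence the integral of interest could in any case only be 
a pure imaginary quantity 

$$\int u_n^*(q,a)\, \frac{\partial u_n(q,a)}{\partial a}\, dq = i\beta_n(a), \tag{97.12}$$

where $\beta_n(a)$ is real. This result will then make it possible to choose 
the phase factor for our eigenfunctions so that the integral in question 
would vanish. To see this, consider the new eigenfunctions 

$$U_n(q,a) = u_n(q,a)\, e^{-i \int \beta_n(a)\, da}, \tag{97.13}$$

which would also be solutions of (97.1). Our integral would then take 
the form 

$$\int U_n^*(q,a)\, \frac{\partial U_n(q,a)}{\partial a}\, dq = \int u_n^*(q,a) \left[\frac{\partial u_n(q,a)}{\partial a} - i\beta_n(a)\, u_n(q,a)\right] dq = i\beta_n(a) - i\beta_n(a) = 0, \tag{97.14}$$

with the help of (97.12). For simplicity we shall assume that our 
eigenfunctions have been chosen in such a way as to secure this result. 
Hence the summation in (97.10) will only have to be taken over states 
where $k \neq n$. 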

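Let us now consider the values of such integrals when $k$ is not equal 
to $n$. Differentiating equation (97.1) with respect to $a$, we can write 

$$\frac{\partial \mathbf{H}}{\partial a}\, u_k(q,a) + \mathbf{H}\, \frac{\partial u_k(q,a)}{\partial a} = \frac{dE_k}{da}\, u_k(q,a) + E_k\, \frac{\partial u_k(q,a)}{\partial a}. \tag{97.15}$$

And multiplying this through by $u_n^*(q,a)$ and integrating over 
configuration space, we then obtain from the normalization and 
orthogonality of the eigenfunctions, and from the Hermitian character of 
the operator $\mathbf{H}$, 

$$\left[\frac{\partial H}{\partial a}\right]_{nk} = (E_k - E_n) \int u_n^*(q,a)\, \frac{\partial u_k(q,a)}{\partial a}\, dq, \tag{97.16}$$

where the indicated matrix components are defined by 

$$\left[\frac{\partial H}{\partial a}\right]_{nk} = \int u_n^*(q,a)\, \frac{\partial \mathbf{H}}{\partial a}\, u_k(q,a)\, dq. \tag{97.17}$$

Solving (97.16) then allows us to calculate the desired integrals from 

$$\int u_n^*(q,a)\, \frac{\partial u_k(q,a)}{\partial a}\, dq = \frac{1}{E_k - E_n} \left[\frac{\partial H}{\partial a}\right]_{nk}. \tag{97.18}$$

Substituting (97.18) into (97.10), and remembering that we have to 
sum only over states $k \neq n$, we may now write our equation for the 
time dependence of the probability amplitudes in the form 

$$\dot{C}_n(t) + \frac{2\pi i}{h}\, C_n(t)\, E_n + \dot{a} \sum_{k \neq n} \frac{1}{E_k - E_n} \left[\frac{\partial H}{\partial a}\right]_{nk} C_k(t) = 0. \tag{97.19}$$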
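
The relation (97.18) between the integrals of the eigenfunctions and the matrix components of the derivative of the Hamiltonian can be tested by finite differences. The following Python sketch is an illustration only: it assumes NumPy and a hypothetical real symmetric three-state Hamiltonian depending linearly on the parameter $a$, so that sums over the three components take the place of integration over $q$. The eigenvectors are given consistent signs at neighbouring values of $a$, which for real eigenvectors is the analogue of the choice of phase discussed above.

```python
import numpy as np

# Hypothetical Hamiltonian H(a) = H0 + a * H1, so that dH/da = H1.
H0 = np.diag([0.0, 1.0, 2.5])
H1 = np.array([[0.3, 0.4, 0.1], [0.4, -0.2, 0.5], [0.1, 0.5, 0.6]])


def eigensystem(a, reference=None):
    """Eigenvalues and eigenvectors of H(a); columns of the result are u_k."""
    energies, vecs = np.linalg.eigh(H0 + a * H1)
    if reference is not None:
        # Fix each sign so that u_k(a) has positive overlap with the reference.
        signs = np.sign(np.sum(reference * vecs, axis=0))
        vecs = vecs * signs
    return energies, vecs


a, step = 0.2, 1e-5
energies, u = eigensystem(a)
_, u_plus = eigensystem(a + step, reference=u)
_, u_minus = eigensystem(a - step, reference=u)
du_da = (u_plus - u_minus) / (2 * step)  # central difference for du_k/da

overlap = u.T @ du_da                          # overlap[n, k]: left side of (97.18)
dh_matrix = u.T @ H1 @ u                       # [dH/da]_nk
gap = energies[None, :] - energies[:, None]    # gap[n, k] = E_k - E_n

off_diagonal = ~np.eye(3, dtype=bool)
predicted = dh_matrix[off_diagonal] / gap[off_diagonal]
print(np.max(np.abs(overlap[off_diagonal] - predicted)))
```

The diagonal elements of `overlap` come out as zero to within the accuracy of the finite difference, in agreement with the vanishing of the integrals for $k = n$.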


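(b) Gradual change in parameter. We are now ready to consider the 
effect of a very slow change in the parameter $a$ on the probability for 
finding the system in different states $n$. For this purpose it is 
convenient to transform (97.19) by substituting 

$$C_n(t) = \Gamma_n(t)\, e^{-(2\pi i/h) \int_0^t E_n\, dt}. \tag{97.20}$$

Since the quantities $\Gamma_n(t)$ have the same moduli as the $C_n(t)$, they will 
be just as useful to us in determining the probabilities for the different 
states $n$. Introducing (97.20) into (97.19), this can now be written in 
the simplified form 

$$\dot{\Gamma}_n(t) + \dot{a} \sum_{k \neq n} \frac{1}{E_k - E_n} \left[\frac{\partial H}{\partial a}\right]_{nk} \Gamma_k(t)\, e^{(2\pi i/h) \int_0^t (E_n - E_k)\, dt} = 0, \tag{97.21}$$

or, introducing the abbreviation 

$$E_n - E_k = h\nu_{nk},$$

we have 

$$\dot{\Gamma}_n(t) = \dot{a} \sum_{k \neq n} \frac{1}{h\nu_{nk}} \left[\frac{\partial H}{\partial a}\right]_{nk} \Gamma_k(t)\, e^{2\pi i \int_0^t \nu_{nk}\, dt}. \tag{97.22}$$

We now wish to use this equation to calculate the change in $\Gamma_n$ when 
the parameter $a$ changes by a prescribed amount, say from $a_0$ to $a_0 + \Delta a$, 
at a vanishingly small rate $\dot{a}$. Without loss in significance we can take 
this rate as constant and write 

$$\Delta a = \dot{a}\, T, \tag{97.23}$$

where $T$ is the total time involved in the change. With a fixed change 
$\Delta a$ in the parameter, we shall then be interested in the limiting case 

$$\dot{a} \to 0 \quad \text{and} \quad T \to \infty. \tag{97.24}$$

Integrating (97.22), we have 

$$\Gamma_n(T) - \Gamma_n(0) = \dot{a} \sum_{k \neq n} \int_0^T \frac{1}{h\nu_{nk}} \left[\frac{\partial H}{\partial a}\right]_{nk} \Gamma_k\, e^{2\pi i \int_0^t \nu_{nk}\, dt}\, dt, \tag{97.25}$$

and must now estimate the value of the right-hand side of this equation 
at the limit $T \to \infty$. 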

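To do this we note that the integrals involved on the right-hand side 
of (97.25) can be re-expressed with the help of a partial integration in 
the form 

$$\int_0^T \frac{1}{h\nu_{nk}} \left[\frac{\partial H}{\partial a}\right]_{nk} \Gamma_k\, e^{2\pi i \int_0^t \nu_{nk}\, dt}\, dt = \left[\frac{1}{2\pi i h \nu_{nk}^2} \left[\frac{\partial H}{\partial a}\right]_{nk} \Gamma_k\, e^{2\pi i \int_0^t \nu_{nk}\, dt}\right]_0^T - \int_0^T e^{2\pi i \int_0^t \nu_{nk}\, dt}\, \frac{d}{dt}\left\{\frac{1}{2\pi i h \nu_{nk}^2} \left[\frac{\partial H}{\partial a}\right]_{nk} \Gamma_k\right\} dt. \tag{97.26}$$

We shall treat this expression on the assumption that 

$$\nu_{nk} \neq 0, \quad [\ldots] \ \text{is continuous}, \tag{97.27}$$

throughout the process. The first term on the right-hand side of 
(97.26) is then at once seen to be finite since it is the difference of two 
quantities which are themselves recognized to be finite. To study the 
second term we may rewrite it, with the help of (97.22) and with 
the help of the circumstance that certain quantities therein depend on 
time only through their dependence on $a$, in the form 

$$-\dot{a} \int_0^T e^{2\pi i \int_0^t \nu_{nk}\, dt}\, \frac{d}{da}\left\{\frac{1}{2\pi i h \nu_{nk}^2} \left[\frac{\partial H}{\partial a}\right]_{nk}\right\} \Gamma_k\, dt - \dot{a} \int_0^T e^{2\pi i \int_0^t \nu_{nk}\, dt}\, \frac{1}{2\pi i h \nu_{nk}^2} \left[\frac{\partial H}{\partial a}\right]_{nk} \sum_{l \neq k} \frac{1}{h\nu_{kl}} \left[\frac{\partial H}{\partial a}\right]_{kl} \Gamma_l\, e^{2\pi i \int_0^t \nu_{kl}\, dt}\, dt.$$

In accordance with the assumption expressed by (97.27), none of the 
factors in the above integrands will be infinite. The integrals will then 
be of the order of the integration interval $T$, and hence the second 
term in (97.26) is itself of the order $\dot{a}T = \Delta a$ which is finite. Thus the 
integrals are bounded even when $T$ goes to infinity. 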

Hence, returning to (97.25), we can now write 

$$\Gamma_n(T) = \Gamma_n(0) + O(\dot{a}), \tag{97.28}$$

where $O(\dot{a})$ indicates a quantity of the order of magnitude $\dot{a}$. Squaring 
both sides of this equation, and noting from (97.20) that $\Gamma_n$ has the 
same modulus as the probability amplitude $C_n$ for the state $n$, we then 
obtain in general 

$$|C_n(T)|^2 = |C_n(0)|^2 + O(\dot{a}), \tag{97.29}$$

and 

$$|C_n(T)|^2 = O(\dot{a}^2) \tag{97.30}$$

in the special case where the initial probability for the state $n$ is equal 
to zero. 

Considering the limit when $\dot{a} \to 0$, we thus obtain the general quantum 
mechanical principle that the probabilities for finding a system in its energy 
eigenstates n do not change with time when changes are made in external 
parameters at a vanishingly slow rate. This is the quantum mechanical 
analogue of Ehrenfest’s so-called adiabatic principle familiar in the older 
quantum theory. 

We have derived the principle with the help of the assumption 
expressed by (97.27). It is evident, however, that the principle would 
still be true if $\nu_{nk}$ should pass through zero during the process without 
too high an order of contact. Hence we may ascribe validity to the 
principle in all but very special cases. 
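
The adiabatic principle can be illustrated by following a simple system while its parameter is swept at different rates. The following Python sketch is an illustration only and rests on several assumptions that are not in the original treatment: NumPy, a hypothetical two-state Hamiltonian with diagonal elements $a$ and $-a$ and a constant off-diagonal element `g`, units in which $h/2\pi = 1$, and a stepwise integration in which the parameter is held at its midpoint value during each short step. The probability of remaining in the instantaneous lowest energy state is computed for a slow and for a rapid sweep.

```python
import numpy as np


def hamiltonian(a, g=1.0):
    """Hypothetical two-state Hamiltonian depending on the parameter a."""
    return np.array([[a, g], [g, -a]], dtype=float)


def lowest_state(a, g=1.0):
    """Instantaneous eigenvector of lowest energy for the given value of a."""
    _, vecs = np.linalg.eigh(hamiltonian(a, g))
    return vecs[:, 0].astype(complex)


def step_propagator(a, dt, g=1.0):
    """exp(-i H dt) in closed form, valid here because H squared is (a^2 + g^2) I."""
    omega = np.hypot(a, g)
    return (np.cos(omega * dt) * np.eye(2)
            - 1j * np.sin(omega * dt) * hamiltonian(a, g) / omega)


def stay_probability(rate, a_start=-10.0, a_end=10.0, n_steps=20000):
    """Probability of being found in the lowest state after sweeping a at a constant rate."""
    total_time = (a_end - a_start) / rate
    dt = total_time / n_steps
    state = lowest_state(a_start)
    for j in range(n_steps):
        a_mid = a_start + rate * (j + 0.5) * dt  # parameter at the middle of the step
        state = step_propagator(a_mid, dt) @ state
    return float(abs(np.vdot(lowest_state(a_end), state)) ** 2)


slow = stay_probability(rate=0.05)
fast = stay_probability(rate=20.0, n_steps=2000)
print(slow, fast)
```

For the slow sweep the probability of the lowest state should remain very nearly unity, while for the rapid sweep a large part of the probability is transferred to the other state, in the manner treated in the following subsection on abrupt changes.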

(c) Abrupt change in parameter. We must also treat the effect of an 
abrupt change, in the external parameter $a$, on the probabilities for 
finding a system in its different energy eigenstates. For this purpose 
we may return to our original equation (97.8) for the time dependence 
of the probability amplitudes $C_k$ and write this in the form 

$$\frac{\partial}{\partial t} \sum_k C_k\, u_k(q,a) = -\frac{2\pi i}{h} \sum_k E_k(a)\, C_k\, u_k(q,a). \tag{97.31}$$

Integrating over a time interval 0 to $t$ during which a change is made in 
the parameter from the value $a_0$ to $a_0 + \Delta a$, gives the general relation 

$$\sum_k C_k(a_0 + \Delta a)\, u_k(q, a_0 + \Delta a) - \sum_k C_k(a_0)\, u_k(q, a_0) = -\frac{2\pi i}{h} \int_0^t \sum_k E_k(a)\, C_k\, u_k(q,a)\, dt. \tag{97.32}$$

In the case of an abrupt change, however, it is evident that the last 
term will go to zero as we make the time interval 0 to $t$ shorter and 
shorter, since the integrand involved is certainly finite. Hence we 
shall have 

$$\sum_k C_k(a_0 + \Delta a)\, u_k(q, a_0 + \Delta a) = \sum_k C_k(a_0)\, u_k(q, a_0). \tag{97.33}$$

Multiplying by $u_n^*(q, a_0 + \Delta a)$, and integrating over the coordinate space, 
this then gives us 

$$C_n(a_0 + \Delta a) = \sum_k S_{nk}\, C_k(a_0), \tag{97.34}$$


where the matrix elements indicated are defined by 

$$S_{nk} = \int u_n^*(q, a_0 + \Delta a)\, u_k(q, a_0)\, dq. \tag{97.35}$$

It is to be noted that these quantities are the components of the 
transformation matrix connecting the two kinds of eigenfunctions which 
we have introduced as specially appropriate before and after the change 
has been made in the parameter $a$. To see this let us put 

$$u_n^*(q, a_0 + \Delta a) = \sum_m S_{nm}\, u_m^*(q, a_0) \tag{97.36}$$

as the development of the new eigenfunctions in terms of the old. 
Multiplying by $u_k(q, a_0)$, and integrating over the coordinate space, we 
then indeed do obtain 

$$\int u_n^*(q, a_0 + \Delta a)\, u_k(q, a_0)\, dq = \sum_m S_{nm} \int u_m^*(q, a_0)\, u_k(q, a_0)\, dq = S_{nk}, \tag{97.37}$$

in agreement with the definition (97.35). 

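It will also be noted that the components of this matrix satisfy the 
two necessary conditions for a unitary transformation 

$$\sum_m S_{nm}\, S_{km}^* = \delta_{nk} \quad \text{and} \quad \sum_m S_{mn}^*\, S_{mk} = \delta_{nk}. \tag{97.38}$$

To see this we may use (97.36) to write 

$$\int u_n^*(q, a_0 + \Delta a)\, u_k(q, a_0 + \Delta a)\, dq = \sum_m \sum_l S_{nm}\, S_{kl}^* \int u_m^*(q, a_0)\, u_l(q, a_0)\, dq,$$

which, from the normalization and orthogonality of the eigenfunctions, 
gives 

$$\delta_{nk} = \sum_m S_{nm}\, S_{km}^*,$$

verifying the first of equations (97.38), and the second can be verified 
by similar methods. 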
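
The transformation brought about by an abrupt change can be written out explicitly when the system has only a few eigenstates. The following Python sketch is an illustration only: it assumes NumPy and a hypothetical three-state Hamiltonian depending linearly on the parameter, with sums over the three components taking the place of integration over the coordinate space. The names `overlap_matrix`, `c_old` and `c_new` are introduced here for illustration.

```python
import numpy as np

# Hypothetical Hamiltonian H(a) = H0 + a * H1.
H0 = np.diag([0.0, 1.0, 2.5])
H1 = np.array([[0.3, 0.4, 0.1], [0.4, -0.2, 0.5], [0.1, 0.5, 0.6]])


def eigenfunctions(a):
    """Columns are the energy eigenvectors of H(a)."""
    return np.linalg.eigh(H0 + a * H1)[1]


a0, delta_a = 0.2, 1.5
u_old = eigenfunctions(a0)
u_new = eigenfunctions(a0 + delta_a)

# Overlap of each new eigenfunction (index n) with each old one (index k).
overlap_matrix = u_new.conj().T @ u_old

# System initially in the lowest energy eigenstate for the old parameter value.
c_old = np.array([1.0, 0.0, 0.0], dtype=complex)
c_new = overlap_matrix @ c_old     # amplitudes in the new eigenstates
probabilities = np.abs(c_new) ** 2

print(probabilities, probabilities.sum())
```

The state itself is unchanged by the abrupt alteration of the parameter; only its description in terms of the new energy eigenstates differs, and the unitary character of the overlap matrix ensures that the new probabilities again sum to unity.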

As already remarked, we shall need the results of this method of 
treating Schroedinger’s equation for a non-isolated system at a later 
place, in § 124 of Chapter XIII. For the remainder of the present 
chapter, however, it will only be necessary to consider the methods of 
integrating Schroedinger’s equation in the case of an isolated system, 
which were discussed in the preceding section. 

98. Observation and specification of state in studying the change of quantum mechanical systems with time 

(a) Complementarity restrictions on observations. Having made a 
suitable disposition of the purely mathematical problem of integrating 
the Schroedinger equation for the change in the quantum mechanical 
state of a system with time, we may now turn to the more physical 
question of considering the nature of the observations and 
specifications of state that would be theoretically possible or actually 
appropriate in investigating the temporal behaviour of such a system. We 
encounter quantum mechanical aspects of this problem which are quite 
different from anything that had to be considered in the classical 
mechanics. 

In the classical mechanics, in following the temporal behaviour of 
a system, it was regarded as theoretically possible to make precise and 
instantaneous observations of the classical state of a system, at any 
time of interest, without in any way interfering with its further behaviour, 
and then to use the result obtained for the prediction of future 
states. In the quantum mechanics, in order to treat the temporal 
behaviour of a system, it is also regarded as theoretically possible to 
make a precise observation of quantum mechanical state and to use 
the result for the prediction of future states and of corresponding 
expectation values for quantities of interest. Nevertheless, it is no 
longer regarded as always possible to make precise observations without 
appreciably affecting the further behaviour of the system, nor to 
carry them out instantaneously, since the quantum mechanics recognizes 
that the act of observation may involve an uncontrollable interaction 
between system and measuring instrument, and that an interval 
of time may be needed to secure precision in the measurement of energy 
or quantities related thereto. Both of these new aspects of the situa- 
tion, due to the complementarity features characteristic of the quantum 
mechanics, have consequences for the kind of observations which may 
now be regarded as appropriate in studying the temporal behaviour 
of systems. 

We may first consider the possible effects of observation on the 
further behaviour of a system of interest. Such effects arise from the 
circumstance that the measurement of one kind of dynamical variable 
is connected in the quantum mechanics with uncontrollable changes 
in other variables which stand thereto in a complementary relation. 
Such changes, moreover, become more and more serious the greater we 
try to make the precision of measurement. As a consequence, it may 
then become inappropriate in the quantum mechanics to make a precise 
determination of one kind of quantity when the temporal behaviour of 
the system which is being studied is also dependent on a complementary 
kind of quantity. 

These remarks may be illustrated by an example of the kind typically 
treated by the methods of statistical mechanics. In studying the 
tendency for a sample of gas of approximately given energy content 
to distribute itself uniformly between two connecting containers, it 
is clear that the quantum mechanics would allow us in principle to 
approach a very precise observation of the initial positions of the 
component molecules, for instance by using a battery of microscopes 
illuminated by exceedingly hard γ-rays. Nevertheless, the result of a 
method of measurement so ‘violent’ as this would be to make large 
and uncontrolled changes in the kinetic energies of the molecules, and 
thus to affect the actual rate of flow from one container to another. 
Hence, if we desire to study the rate of flow when the gas has a given 
fairly well determined energy content, it is evident that we shall wish 
to employ ‘gentler’ though less precise methods of observing the initial 
spatial distribution, for instance by measuring the colours, or densities 
of the gas in the two containers. 

We may now turn to a discussion of the circumstance that instantane- 
ous observations are not generally feasible in the quantum mechanics. 
Limits on the duration of observation in the quantum mechanics arise 
as a consequence of complementarity relations involving the time, as 
given by the Heisenberg expression 

$$\Delta E\,\Delta t \sim h, \tag{98.1}$$

which connects the order of magnitude of the uncertainty $\Delta E$ in the 
energy E of a system with the uncertainty $\Delta t$ in the time t at which 
an observation is made on the system. In accordance with this 
expression, if we make an observation on the condition of a system 
compatible with a specification of energy of precision $\Delta E$, it will then 
be necessary in making the observation to employ an interval of time 
$\tau$ which will satisfy 

$$\tau \gtrsim \frac{h}{\Delta E}. \tag{98.2}$$

As a consequence we now see that the time interval $\tau$ devoted to 
observation cannot be taken too short in the quantum mechanics, if 
we desire a given precision in our definition of the energy of a system. 
On the other hand, it will also be noted that the time interval $\tau$ 
cannot be taken too long, if we desire to study a given kind of 
change which would itself take place in a period denoted by T, since 
it is evident that it will also be necessary for the time of observation 
$\tau$ to satisfy the relation 

$$\tau \ll T \tag{98.3}$$

in order to carry out the study at all. This imposition of upper and 
lower limits on the length of time $\tau$ devoted to observation has an 
important influence on the nature of the observations appropriate in 
studying the change of quantum mechanical systems with time. 
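
The two limits on the time of observation can be put into numbers with a short sketch. It assumes the limits combine as h/ΔE ≲ τ ≪ T; the function name `observation_window` and the factor `margin`, which stands in for "much shorter than", are illustrative assumptions and do not come from the text.

```python
PLANCK_H = 6.62607015e-34  # Planck's constant in joule seconds


def observation_window(delta_e_joule, change_period_s, margin=10.0):
    """Return (tau_min, tau_max) for the time of observation, or None.

    tau_min = h / delta_e is the lower limit set by the desired energy precision.
    tau_max = T / margin is a stand-in for the requirement tau << T;
    the value of margin is an assumption made only for this illustration.
    """
    tau_min = PLANCK_H / delta_e_joule
    tau_max = change_period_s / margin
    if tau_min > tau_max:
        return None  # no observation time satisfies both limits
    return tau_min, tau_max


if __name__ == "__main__":
    # A change taking about a microsecond, studied with two energy precisions.
    print(observation_window(1e-24, 1e-6))  # a window exists
    print(observation_window(1e-30, 1e-6))  # precision too high: no window
```

When the function returns None, the precision demanded for the energy is too high for the change of interest to be followed at all, which is the situation the text describes as making too accurate observations inappropriate.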

These remarks may also be illustrated by an example of the kind 
typically treated by the methods of statistical mechanics. In studying 
the tendency for the molecules of an enclosed sample of gas to acquire 
their equilibrium distribution of kinetic energy as a result of collisions, 
the classical mechanics would allow us in principle to make a precise 
assay of the instantaneous distribution. Nevertheless, it is evident in 
the quantum mechanics that we must not try to make such an observational 
assay too precise, since the time $\tau$ needed for this purpose would 
become longer the smaller we make the uncertainty $\Delta E$ in the measured 
energies, and we must keep this time short compared with the time T 
which we take as an approximate expression for that involved in the 
redistribution of energies by collision. 

It will be seen from the foregoing discussion that the complementarity 
features characteristic of the quantum mechanics have the general effect 
of making the use of too accurate methods of observation inappro- 
priate in studying the changes in quantum mechanical systems with 
time. 

(b) Approximate specifications of state in the quantum mechanics. 
Paying attention to the general character of the observational process 
and to the special quantum mechanical characteristics of observations 
discussed above, we may now consider the specifications of condition 
that might be appropriately made in applying the principles of 
mechanics to obtain insight into the temporal behaviour of quantum 
mechanical systems. As in the case of the classical mechanics, it is of 
course possible in the quantum case to start with a precise specification 
of the initial state of a system of interest and then apply the principles 
of the exact quantum mechanics (i.e. make use of Schroedinger’s equation) 
to predict the precise state at any later time together with the 
corresponding expectation values for quantities of interest. Neverthe- 
less, there are a number of reasons which often make it possible to 
obtain a better insight into the temporal behaviour of quantum systems 
by starting out with approximate specifications of initial state and then 
using the principles of statistical quantum mechanics to investigate 
the temporal behaviour of a suitable representative ensemble for the 
system of interest. 

In the first place it will be recognized, just as in the classical 
mechanics, that the actual observations which we make on the initial 
states of systems are for practical reasons usually only approximate in 
character, since actual measuring instruments provide in general less 
precision than is theoretically possible. Hence, for this reason, approxi- 
mate specifications of state would have a better correspondence with 
the real situations to be studied. 

In the second place, again as in the classical mechanics, it will be 
appreciated in the case of very complicated systems composed of many 
molecules that the application of the methods of exact mechanics might 
itself become very complicated and difficult. Hence also for this reason 
it may be more appropriate to use the methods of statistical quantum 
mechanics with the accompanying approximate specifications of 
state. 

In addition to such ‘classical’ reasons, special ‘quantum’ reasons, 
making approximate specifications of state more suitable than exact 
ones in obtaining a theoretical insight into the temporal behaviour of 
systems, can arise from the previously discussed circumstance that 
approximate methods of observation may be more appropriate in the 
study of changes with time than exact ones on account of comple- 
mentarity relations. This can be most simply illustrated if we consider 
observations that might be made on unperturbed energy eigenstates of 
a system in studying its transitions from one such state to another. In 
making such observations, limitations would be imposed on the precise 
determination of energy, as was seen above, since the time $\tau$ devoted 
to observation would have to be reduced to an interval short compared 
with the period T characterizing the transitions.† In the case of an 
almost continuous energy spectrum, this can lead to an uncertainty in 
energy wide enough to include a number of unperturbed energy 
eigenstates. This would then make it appropriate in studying such 
rates of transition to regard the initial state of the system as only 
approximately specified, and to represent the initial condition by an 
ensemble of similar systems suitably distributed over the different 
unperturbed eigenstates that could agree with the knowledge available. 
This method of procedure will be discussed in more detail in the next 
section.† Similar reasons for approximate specifications may arise in 
cases where it would be natural to use other quantum mechanical languages 
than those provided by unperturbed energy eigenfunctions in 
order to have states corresponding to the kind of initial observation of 
interest. The more general method of procedure then appropriate will 
also be discussed in what follows in § 101. 

† The actual study of the periods T which would be involved in such transitions 
will be carried out in the next section. 

Attention has already been called, at the beginning of the book, to 
the circumstance that the idea of the precise state of a system is in 
any case an abstract limiting concept, and hence that the methods of 
exact mechanics apply to rather highly idealized situations, while the 
methods of statistical mechanics apply to conceptual situations involving 
less drastic abstraction from reality. As indicated in the foregoing, 
the appropriateness of the less abstract methods of statistical 
mechanics becomes even more evident in the quantum than in the 
classical mechanics, since the very process of obtaining the knowledge 
of precise state necessary for the application of exact mechanics might 
now actually interfere with those features of the behaviour which it 
was desired to study. Indeed, even the treatment of such a well-known 
problem as the transitions of a simple atomic system back and forth 
between the levels of a discrete and of a continuous spectrum of states 
actually involves considerations of an essentially statistical mechanical 
character, as we shall see in the next section. The special circumstance, 
that the applications of the exact quantum mechanics itself are so 
frequently concerned with statistical quantities, such as expectation 
values, has had a tendency to obscure the circumstance that an added 
incorporation of ideas, really germane to statistical mechanics, is made 
in the course of the treatment of such problems. 

† The above arguments justifying such a procedure were first clearly expressed by 
Pauli, Sommerfeld Festschrift, Leipzig, 1928, p. 30. 

(c) Approximate specification of unperturbed energy eigenstates. It 
will now be profitable to give more detailed treatment to the principles 
and formalism to be applied in the specification of states for quantum 
mechanical systems that are changing with time. We may begin by 
considering specifications that would be appropriate when observations 
are made on the unperturbed energy eigenstates of a system, as defined 
by equations of the form 

$$H^0 u_k(q) = E_k^0\,u_k(q), \tag{98.4}$$

where $H^0$ is some suitable unperturbed Hamiltonian operator for the 
system. 

In accordance with our previous discussion, it would in any event 
be desirable to limit the accuracy of such observations to the extent 
demanded by the restrictions on the time used in observation which 
were expressed by (98.2) and (98.3). These restrictions may be re- 
written in the combined form 

$$\Delta E \gtrsim \frac{h}{\tau} \gg \frac{h}{T}, \tag{98.5}$$

where $\Delta E$ expresses the extent to which the energy of the system must 
be regarded as undetermined, $\tau$ is the time interval that can be suitably 
devoted to the observation, and T is the period characterizing the 
change from one condition of interest to another. As a consequence 
of this uncertainty in energy, combined with the approximate agree- 
ment between true and unperturbed energy levels for the system, it 
then becomes necessary to conclude that an observation on the system 
of the kind under consideration could not lead to a perfectly precise 
determination of its unperturbed energy eigenstate. 

In the special case of an initial observation, such that only a single 
discrete energy level corresponding to the observation happens to be 
within the range of uncertainty $\Delta E$, it would then indeed be appropriate 
to predict the further behaviour of the system by regarding its 
initial condition as precisely specified by the indicated eigenstate. 
Under such circumstances, no immediate necessity for the use of statis- 
tical mechanics would arise. 

On the other hand, in the case of an initial observation such that 
a number of eigenstates corresponding to the observation are found to 
have energies within the range $\Delta E$ that has to be allowed, and, in 
general, when the observations are not sufficiently accurate to distinguish 
between states, it would no longer be justifiable to regard the 
initial condition of the system as precisely specified by any particular 
eigenstate. Under such circumstances it would then be necessary to 
turn to the methods of statistical mechanics to obtain a reasonable 
prediction as to the future behaviour of the system. 

To employ the methods of statistical mechanics in this connexion 
a suitable representative ensemble would have to be set up to correspond 
to our actual knowledge of the initial condition of the system. 
For this purpose let us take the result of the initial observation as 
showing that the actual state of the system might be equally well 
represented by any one of a group of unperturbed energy eigenstates 
k, lying within the energy range $\Delta E$ that describes the accuracy of 
the observation. In accordance with the fundamental hypothesis of 
equal a priori probabilities and random a priori phases for quantum 
mechanical states, as discussed in Chapter IX, the initial condition of 
the representative ensemble might then be obtained by assigning equal 
mean values to the squares of the probability amplitudes for the 
various states k that lie in the group and random values to their 
phases. Noting that the probability amplitude $a_k$ for the state k would 
be related to the corresponding probability coefficient $c_k$ by the simple 
expression 

$$a_k = c_k\,e^{-\frac{2\pi i}{h}E_k^0 t} \tag{98.6}$$

and that exponents would cancel in diagonal terms, the initial state of 
the ensemble would then be described by the density matrix 

$$\rho_{kl} = \overline{\overline{c_l^*\,c_k}} = \begin{cases} \dfrac{1}{G_\kappa}\,\delta_{kl} & (\text{state } k \text{ in group } G_\kappa), \\[1ex] 0 & (\text{state } k \text{ not in group } G_\kappa). \end{cases} \tag{98.7}$$

By studying the temporal behaviour of such an ensemble of systems 
it would then be possible to make reasonable predictions as to the later 
condition of the system of interest itself, as we shall see in detail in 
the next section. 
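
The effect of assigning equal probabilities and random phases can be checked numerically. The sketch below is an illustration under stated assumptions rather than part of the original treatment: the function `ensemble_density_matrix`, the ensemble size and the seed are all hypothetical choices. It averages the products of coefficients over many members and shows the off-diagonal elements averaging towards zero while each state in the group keeps the diagonal value 1/G.

```python
import cmath
import math
import random


def ensemble_density_matrix(n_states, group, n_members=20000, seed=0):
    """Average conj(c_l) * c_k over an ensemble with equal |c_k| and random phases.

    group lists the indices of the states compatible with the observation.
    n_members and seed are arbitrary choices made for this illustration.
    """
    rng = random.Random(seed)
    amplitude = 1.0 / math.sqrt(len(group))
    rho = [[0j] * n_states for _ in range(n_states)]
    for _ in range(n_members):
        c = [0j] * n_states
        for k in group:
            phase = rng.uniform(0.0, 2.0 * math.pi)  # random a priori phase
            c[k] = amplitude * cmath.exp(1j * phase)
        for k in group:
            for l in group:
                rho[k][l] += c[l].conjugate() * c[k]
    return [[value / n_members for value in row] for row in rho]


if __name__ == "__main__":
    rho = ensemble_density_matrix(4, [0, 1, 2])
    print("diagonal in group:    ", abs(rho[0][0]))
    print("off-diagonal in group:", abs(rho[0][1]))
    print("diagonal outside:     ", abs(rho[3][3]))
```

The residual off-diagonal values shrink roughly as the inverse square root of the number of members, so they are small but not exactly zero for any finite ensemble.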

(d) Approximate specification of eigenstates in general. We must now 
turn to a consideration of specifications of initial condition that could 
be used in the general case of observations on the characteristic eigenstates 
for any kind of quantity F, instead of observations restricted 
merely to unperturbed energy eigenstates. Attention has already been 
given in §84 to the methods of specifying condition that would be 
appropriate in such a general case, in order to illustrate the nature of 
the hypothesis of equal a priori probabilities and random a priori phases. 

If $\mathbf{F}$ is the operator corresponding to the quantity F on which initial 
observation is made, the eigenvalues $F_k$ and eigenfunctions $u_k(q)$, for 
the states k now of importance, would be determined by equations 
of the form 

$$\mathbf{F}\,u_k(q) = F_k\,u_k(q). \tag{98.8}$$

We shall be primarily interested in situations such that the initial 
observation is not accurate enough to place the system definitely in any 
single one of these eigenstates k. In accordance with our fundamental 
hypothesis as to equal a priori probabilities and random a priori phases, 
and in agreement with our previous equation (84.5), it would then be 
appropriate to represent the initial condition of the system by an 
ensemble with a density matrix of the form 

$$\rho_{kl} = \begin{cases} \dfrac{1}{G_\kappa}\,\delta_{kl} & (\text{state } k \text{ in group } G_\kappa), \\[1ex] 0 & (\text{state } k \text{ not in group } G_\kappa), \end{cases} \tag{98.9}$$

if we regard the observation as equally well represented by any one of 
a group of eigenstates k. Or, in agreement with (84.6), it would be 
appropriate to represent the initial condition by an ensemble with the 
density matrix (Fjfc-F o)» 

Pki — = Po^ ^kl (98.10) 

if we wish to emphasize a decreasing probability for values of the 
quantity F more and more removed from a most probable value $F_0$. 

As a new aspect of such situations, which we now appreciate as 
applying in the case of an initial observation from which we wish to 
make predictions as to further behaviour, it may be emphasized that 
the discussion of this section has provided us with additional quantum 
mechanical reasons making approximate specifications of state often 
specially desirable on account of the necessity for avoiding too ‘violent’ 
and too ‘time-consuming’ methods of measurement. 

(e) Remarks on the assignment of equal probabilities and random 
phases. We may now conclude this long section, on the observation 
and specification of state in studying the temporal behaviour of 
quantum mechanical systems, with some general remarks on the assign- 
ment of initial probabilities and phases in constructing a suitable repre- 
sentative ensemble for a system of interest. 

In the first place it is to be emphasized that the assignment of equal 
probabilities and of random phases to certain states, as illustrated for 
example by (98.9), is to be regarded as definitely correlated with the 
nature of the initial observation. In accordance with our fundamental 
hypothesis as to a priori probabilities and phases, the assignment in 
question is to be made to quantum mechanical states that agree 
equally well with our partial knowledge of actual state. Hence this 
equality of probability and randomness of phase will apply only in 
the quantum mechanical language that is determined by the kind of 
quantity measured, and on transformation to other quantum mechani- 
cal languages this characteristic of the representative ensemble will in 
general no longer be apparent. 



In the second place it is to be noted that the assignment is also to 
be regarded as definitely correlated with the time at which the initial 
observation is made. As time proceeds the initial assignment of proba- 
bility and phase for eigenstates in the language corresponding to 
the original observation will in general be lost. Indeed, as we shall see 
later, we can usually expect a tendency for the distribution to spread 
over many states together with a tendency for decrease in the random- 
ness of phase. In this connexion it may be mentioned that the original 
complete randomness of phase will play a primary role in deriving this 
conclusion. 

One further point will be of importance. Although our representative 
ensemble will be set up to exhibit equal probabilities and random phases 
for certain states in the quantum mechanical language corresponding to 
the quantity which we regard as approximately observed, nevertheless 
it is evident that transformation to a language corresponding to a 
quantity which is only slightly different therefrom may produce only 
a slight disturbance in the equality of probability and randomness 
of phase. Indeed, in the frequently typical case of an approximate 
observation, which could be equally well represented by many states, 
the complete invariance towards transformation of the density matrix 
for a uniform ensemble would make us expect wide possibilities for 
transformation to other languages without greatly disturbing the equa- 
lity and randomness mentioned. These remarks are of interest in con- 
nexion with the circumstance that the performance of a measurement 
which is only approximate in character is not really sufficient for a 
unique assertion as to the precise quantity which we should regard 
ourselves as observing. We can now see, however, that slight changes 
in this respect cannot be expected to have a critical effect in determining 
the nature of our conclusions. 
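
The dependence of random phases on the quantum mechanical language, and the invariance of the uniform ensemble, can be illustrated with a small numerical sketch. Everything in it is an assumption chosen for illustration: a three-state system, a discrete Fourier matrix as the unitary transformation between languages, and the helper names `dft_unitary`, `transform` and `max_off_diagonal`.

```python
import cmath
import math


def dft_unitary(n):
    """Unitary n x n matrix with elements exp(2*pi*i*j*k/n) / sqrt(n)."""
    return [[cmath.exp(2j * math.pi * j * k / n) / math.sqrt(n)
             for k in range(n)] for j in range(n)]


def transform(rho, u):
    """Return U rho U^dagger, the density matrix in the new language."""
    n = len(rho)
    return [[sum(u[i][k] * rho[k][l] * u[j][l].conjugate()
                 for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]


def max_off_diagonal(m):
    """Largest magnitude among the off-diagonal elements of m."""
    n = len(m)
    return max(abs(m[i][j]) for i in range(n) for j in range(n) if i != j)


U = dft_unitary(3)
# Uniform ensemble: equal probabilities and random phases over all three states.
uniform = [[1 / 3 if i == j else 0 for j in range(3)] for i in range(3)]
# Partial ensemble: equal probabilities over a group of two states only.
partial = [[0.5, 0, 0], [0, 0.5, 0], [0, 0, 0]]

if __name__ == "__main__":
    print("uniform ensemble:", max_off_diagonal(transform(uniform, U)))
    print("partial ensemble:", max_off_diagonal(transform(partial, U)))
```

For the uniform ensemble the transformed matrix stays diagonal, whereas the ensemble confined to two states acquires off-diagonal elements, so its phases no longer appear random in the new language.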

99. Time-proportional transitions 

Quantum mechanical transitions between conditions described by 
unperturbed energy eigenfunctions, which have a probability of oc- 
currence proportional to the time, are of great importance for an 
understanding of atomic and molecular behaviour and for statistical 
mechanical applications. Such time-proportional transitions are found 
to occur when at least one of the two quantum mechanical conditions, 
between which the transition takes place, is to be described by a collection 
of states belonging to a practically continuous spectrum of 
unperturbed energy eigenstates that correspond to some suitable 
unperturbed Hamiltonian $H^0$. We are now ready to treat the theory 
of such transitions with the help of the approximate integration of 
Schroedinger’s equation by the method of variation of constants as 
discussed in § 96 (b), together with the help of the principles to be used 
in setting up an initial representative ensemble as particularly discussed 
in § 98(c). 

(a) Transition from a discrete state to a continuous spectrum of states. 
Let us first consider a quantum mechanical system which has a non-degenerate, 
discrete, unperturbed energy eigenstate k with a value for 
its unperturbed energy $E_k^0$ that happens to lie in the range overlapped 
by a nearly continuous spectrum of non-degenerate, unperturbed 
energy levels which correspond to an observably different condition 
of the system. For example, the discrete state might be one for an 
undissociated molecule and the overlapping continuous states correspond 
to dissociation, or the discrete state might correspond to an excited 
atom and the continuous states to unexcited atom plus radiation. 

Let us then consider that we are interested in making approximate 
observations on the condition of the system which would determine its 
actual unperturbed energy eigenstate, within an energy range $\Delta E$ subject 
in any case to the lower limit prescribed by our previous expression 
(98.5). Let us assume that our initial observation can be well represented 
by taking the system at time t = 0 as actually being in the 
discrete state k, and let us then calculate the probability of finding it 
at a later time t in a condition corresponding to a group of $G_\nu$ eigenstates 
n, belonging to the continuous spectrum and having a range $\Delta E$ 
of unperturbed energies $E_n^0$ which overlap the original energy $E_k^0$. 

In accordance with the initial observation we can take the original 
condition of the system at t = 0 as specified by setting the quantum 
mechanical probability for the state k equal to unity, 

$$W_k(0) = c_k^*(0)\,c_k(0) = 1. \tag{99.1}$$

And, in accordance with our previous result (96.12), we can then take 
the probability for finding the system at time t, in any particular state 
n, belonging to the continuous spectrum as given by 

$$W_n(t) = 4\,|V_{nk}|^2\,\frac{\sin^2\!\left[\frac{\pi}{h}(E_n^0 - E_k^0)\,t\right]}{(E_n^0 - E_k^0)^2}, \tag{99.2}$$

where $V_{nk}$ is the matrix component, corresponding to the perturbation 
which leads the system to change from one unperturbed state to 
another, and where the time 0 to t must be short enough to justify the 
method of variation of constants. Summing up over all such states n 
that lie in the group of $G_\nu$ states included in the energy range $\Delta E$ of 
interest, we can now write 

$$P_\nu(t) = \sum_n W_n(t) = 4 \sum_n |V_{nk}|^2\,\frac{\sin^2\!\left[\frac{\pi}{h}(E_n^0 - E_k^0)\,t\right]}{(E_n^0 - E_k^0)^2} \tag{99.3}$$

for the total probability of finding the system at time t in one or another 
of these states, provided the time t is short enough to justify such 
a direct addition of probabilities. 

To evaluate this expression for the probability of finding that transi- 
tion has occurred to the final condition of interest we may now make 
use of our assumption that the levels n form a practically continuous 
spectrum. We may then write 

$$dn = \sigma_n\,dE_n^0 \tag{99.4}$$

as a suitable expression for the number of levels in the range $dE_n^0$, 
where the factor $\sigma_n$ can be treated as a continuous function of $E_n^0$. 
Substituting this expression in (99.3) and replacing summation by 
integration over the energy range $\Delta E$ which we have introduced as 
corresponding to the accuracy of our observation, and which we may 
take as including $E_k^0$ at its centre, we then obtain 

$$P_\nu(t) = 4 \int_{E_k^0 - \frac{1}{2}\Delta E}^{E_k^0 + \frac{1}{2}\Delta E} |V_{nk}|^2\,\sigma_n\,\frac{\sin^2\!\left[\frac{\pi}{h}(E_n^0 - E_k^0)\,t\right]}{(E_n^0 - E_k^0)^2}\,dE_n^0. \tag{99.5}$$

In order to treat this integral it will be convenient to make a change 
of variables by substituting 

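$$y = \frac{\pi}{h}(E_n^0 - E_k^0)\,t, \tag{99.6}$$

where $E_k^0$ is of course a constant, and we treat t as a constant parameter 
which denotes some particular time of interest for observation. 
The integral then assumes the form 

$$P_\nu(t) = \frac{4\pi}{h}\,t \int_{-\pi t\Delta E/2h}^{+\pi t\Delta E/2h} |V_{nk}|^2\,\sigma_n\,\frac{\sin^2 y}{y^2}\,dy. \tag{99.7}$$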

To evaluate the integral we now note that the integrand will be large 
only in a narrow range where y is approximately equal to zero, that is, 
where $E_n^0$ is approximately equal to $E_k^0$. This will make it approximately 
correct to treat $|V_{nk}|^2$ and $\sigma_n$ as constants, assigning to them 
the values that they have at the middle of the range where $E_n^0 = E_k^0$. 
It will also make it approximately valid to take the limits of integration 
as from minus to plus infinity, provided the time t is long enough so that 

$$\frac{\pi t\,\Delta E}{2h} \gg 1 \quad\text{or}\quad t\,\Delta E \gg h. \tag{99.8}$$

The integral then assumes the form 

$$P_\nu(t) = \frac{4\pi}{h}\,|V_{nk}|^2\,\sigma_n\,t \int_{-\infty}^{+\infty} \frac{\sin^2 y}{y^2}\,dy. \tag{99.9}$$

From a known formula of integration we now obtain 

$$P_\nu(t) = \frac{4\pi^2}{h}\,|V_{nk}|^2\,\sigma_n\,t \tag{99.10}$$

as the desired expression for the probability of finding that a transition 
has occurred at the time t from the original discrete state k to the 
collection of states n. We see that such a transition would take place 
with a probability proportional to the time. 
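
The proportionality to the time can be checked by carrying out the sum over a dense ladder of levels directly and comparing it with the closed form $4\pi^2 |V_{nk}|^2 \sigma_n t / h$. The sketch below is a hedged illustration: it assumes equally spaced levels, a constant squared matrix component, and units in which h = 1, and the function names and numerical values are hypothetical choices, not quantities from the text.

```python
import math


def summed_transition_probability(t, v_sq, sigma, delta_e, h=1.0):
    """Sum 4 |V|^2 sin^2[(pi/h) e t] / e^2 over levels in a window delta_e.

    Levels are assumed equally spaced with density sigma per unit energy,
    placed symmetrically about the discrete level at e = 0. The half-step
    offset keeps any level from falling exactly on e = 0.
    """
    spacing = 1.0 / sigma
    n_half = round(delta_e / (2.0 * spacing))
    total = 0.0
    for j in range(-n_half, n_half):
        detuning = (j + 0.5) * spacing
        total += 4.0 * v_sq * math.sin(math.pi * detuning * t / h) ** 2 / detuning ** 2
    return total


def time_proportional_estimate(t, v_sq, sigma, h=1.0):
    """Closed form (4 pi^2 / h) |V|^2 sigma t."""
    return 4.0 * math.pi ** 2 * v_sq * sigma * t / h


# Illustrative values only: weak coupling, dense spectrum, wide energy window.
V_SQ, SIGMA, DELTA_E = 1e-6, 1000.0, 20.0

if __name__ == "__main__":
    for t in (1.0, 2.0, 4.0):
        exact = summed_transition_probability(t, V_SQ, SIGMA, DELTA_E)
        estimate = time_proportional_estimate(t, V_SQ, SIGMA)
        print(t, exact, estimate)
```

With these values the product of t and the energy window is much larger than h while the total probability stays well below unity, so both conditions for the approximation hold. The small remaining discrepancy comes from the finite limits of integration and shrinks as t increases.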

It will be noted that this result has been obtained directly from the 
quantum mechanics proper without any necessity of resorting to statistical 
quantum mechanics. This arises from the circumstance that we 
have regarded the system as starting out in a definitely specified 
quantum mechanical state, and hence have not had to introduce any 
ensemble to represent its initial condition. It will be necessary to 
employ the methods of statistical mechanics, however, in treating the 
inverse problem of transition from the continuous spectrum back to 
the discrete state, as we shall see presently. 

The expression given by (99.10) is of course only valid when the 
approximations made in its derivation are justified. For this to be 
the case the time t must be short enough to make the probability 
that transition has occurred small, since this assumption was involved 
in developing the method of variation of constants in order to integrate 
(96.4), and was again involved above in adding the probabilities for 
different states n in order to obtain (99.3). On the other hand, the 
time t must be long enough to satisfy the relation (99.8), 

$$t\,\Delta E \gg h, \tag{99.11}$$

which was used in justifying the limits of integration ±oo in (99.9). 

Furthermore, if we desire an observational test of our predictions, 
it is evident that t must be of the order of the time T necessary for an 
appreciable change to occur, which by (99.10) can be taken as 

$$T \approx \frac{h}{4\pi^2\,|V_{nk}|^2\,\sigma_n}. \tag{99.12}$$

Since (99.11) must at least be satisfied by such a time T, we have 

$$\Delta E \gg \frac{h}{T} \approx 4\pi^2\,|V_{nk}|^2\,\sigma_n \tag{99.13}$$

as an evaluation of the extent to which the energy of the system must 
be left undetermined. On the other hand, $\Delta E$ must, of course, also be 
small enough so as to permit a distinction of the discrete level k from 
its neighbours. It will be seen that the success of the method depends 
on the separation of the Hamiltonian H for the system into an unperturbed 
Hamiltonian $H^0$ and a sufficiently small perturbation term V. 

There are a number of important atomic processes which can be 
described as a transition from a nearly stationary discrete state to a 
continuous spectrum of states of overlapping energy, and which are 
actually found to take place with a probability proportional to the 
time. These include the transition of atoms from excited states to lower 


levels with the emission of radiation into a narrow range of the possible 
continuous radiation spectrum, the radioactive decay of excited nuclei 
with the emission of alpha particles into a narrow range of the possible 
continuous spectrum of kinetic energies, the dissociation of excited 
atoms into ion plus electron (Rosseland-Auger effect), the dissociation 
of excited molecules into smaller constituents (predissociation), and 
certain scattering processes occurring in the case of atomic collisions 
which can also be described in the above language. 

The actual detailed treatment of these processes is often more com- 
plicated than indicated above. In particular a simple, appropriate 
separation of the Hamiltonian into an unperturbed Hamiltonian plus 
a perturbation term may not be feasible. Nevertheless, by proper 
handling, differential equations of the type (96.4) can often be obtained 
which depend on the elements of a suitable Hermitian matrix even 
though this may not be immediately related to a simple perturbation 
operator V. The fundamental aspects of the calculation are then not 
greatly altered. 

(b) Transition from the continuous spectrum back to the discrete state. 
In the preceding we have investigated the time-proportional transitions 
from a discrete state k to a condition which could be described by a 
group of states n belonging to a continuous spectrum and lying in 
a range $\Delta E$ of unperturbed energies $E_n^0$ which overlaps the original 
energy $E_k^0$. We now wish to consider the inverse process of transition 
back to the discrete state. 

For this purpose we shall now assume that the initial condition of 
the system at time t = 0 would be one in which the system would 
certainly be found in one or another of the individual states n which 
we take as lying in the specified range $\Delta E$. We then have unit probability 
at the start, 

$$\sum_n W_n(0) = \sum_n c_n^*(0)\,c_n(0) = 1, \tag{99.14}$$

for finding the system in one or another of the states n, where we sum 
over all the states, together with zero probability, i.e. 

$$c_k(0) = 0, \tag{99.15}$$

for being found in the discrete state k. Making use of our general 
formula (96.9) for the values of probability coefficients as a function 
of time, and noting (99.15), we can then write 

$$c_k(t) = \sum_n V_{kn}\,c_n(0)\,\frac{1 - e^{\frac{2\pi i}{h}(E_k^0 - E_n^0)t}}{E_k^0 - E_n^0} \tag{99.16}$$

together with 

$$c_k^*(t) = \sum_{n'} V_{kn'}^*\,c_{n'}^*(0)\,\frac{1 - e^{-\frac{2\pi i}{h}(E_k^0 - E_{n'}^0)t}}{E_k^0 - E_{n'}^0} \tag{99.17}$$

as expressions for the probability coefficients at a later time t corresponding 
to the discrete state k, where we take summations for n and 
n' over all the $G_\nu$ states originally occupied. Multiplying (99.16) and 
(99.17), we then have 

$$W_k(t) = c_k^*(t)\,c_k(t) = \sum_{n',n} V_{kn'}^*\,V_{kn}\,c_{n'}^*(0)\,c_n(0)\,\frac{\left[1 - e^{-\frac{2\pi i}{h}(E_k^0 - E_{n'}^0)t}\right]\left[1 - e^{\frac{2\pi i}{h}(E_k^0 - E_n^0)t}\right]}{(E_k^0 - E_{n'}^0)(E_k^0 - E_n^0)} \tag{99.18}$$

for the probability of finding the system back in state k at time t. 

The above summation could evidently be evaluated only on the basis
of some definite assignment of values for the initial coefficients $c_{n'}^*(0)$
and $c_n(0)$. This assignment, moreover, could actually be carried out
in a great variety of ways, all compatible with the initial condition
described by (99.14); and a great variety of actually different results
could be obtained. Hence we shall now be interested in turning to the
general point of view adopted in statistical mechanics, and making
some estimate as to what might be expected as to the average behaviour
of the system in a situation such as we are considering. Such an appeal
to statistical methods is indeed made definitely necessary by the
circumstance that our original condition is not sufficiently well defined
for a precise specification of initial state.

To carry out the application of statistical methods we may now
calculate the mean behaviour of the systems in an appropriate
representative ensemble for the system of actual interest. For this purpose,
in accordance with the postulate of equal a priori probabilities and
random a priori phases, and in agreement with the discussion leading
to (98.7), we may take the initial state of the representative ensemble
for our system at time $t = 0$ as described by the density matrix

$$\rho_{nn'} = \overline{\overline{c_{n'}^*(0)\,c_n(0)}} = \begin{cases} \dfrac{1}{G_\nu}\,\delta_{n'n} & \text{(state } n \text{ in group } G_\nu\text{)}, \\[1ex] 0 & \text{(state } n \text{ not in group } G_\nu\text{)}, \end{cases} \qquad (99.19)$$

where the double bar, as usual, signifies a mean for all the members
of the ensemble. Combining with (99.18), we shall then obtain the
simplified expression

$$\overline{\overline{W_k(t)}} = \frac{1}{G_\nu} \sum_n 4\,|V_{kn}|^2\,\frac{\sin^2\bigl[(\pi/h)(E_k^0 - E_n^0)t\bigr]}{(E_k^0 - E_n^0)^2} \qquad (99.20)$$

for the probability of finding a system in the ensemble in state k at
time t, where the summation is over all states n in the group $G_\nu$.
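The effect of the random-phase assumption behind (99.19) can be illustrated numerically. The following is a minimal sketch and not part of the original treatment: the function name `mean_density_matrix`, the group size G = 4, and the simplification that every member of the ensemble has exactly $|c_n|^2 = 1/G$ are assumptions made for illustration only.

```python
import cmath
import math
import random


def mean_density_matrix(G, members=20000, seed=0):
    """Ensemble mean of conj(c_n') * c_n over members with random phases.

    Assumption for illustration: every member has |c_n|^2 = 1/G exactly,
    and only the phases differ from member to member.
    """
    rng = random.Random(seed)
    amplitude = 1.0 / math.sqrt(G)
    rho = [[0j for _ in range(G)] for _ in range(G)]
    for _ in range(members):
        c = [amplitude * cmath.exp(1j * rng.uniform(0.0, 2.0 * math.pi))
             for _ in range(G)]
        for n in range(G):
            for n_prime in range(G):
                rho[n][n_prime] += c[n_prime].conjugate() * c[n]
    return [[value / members for value in row] for row in rho]


rho = mean_density_matrix(G=4)
diagonal = [abs(rho[n][n]) for n in range(4)]
largest_off_diagonal = max(
    abs(rho[n][m]) for n in range(4) for m in range(4) if n != m
)
```

In this sketch the diagonal elements stay at 1/G while the off-diagonal elements shrink as the number of members grows, which is the behaviour that removes the cross terms with n' different from n from the double sum (99.18).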

This summation may now be handled in the same manner as the
similar expression (99.3), occurring in the preceding example, by taking
$dn = \sigma_n\,dE_n^0$ as the number of states in the range $dE_n^0$, replacing
summation by integration, and introducing suitable approximations. We
are thus led to

$$\overline{\overline{W_k(t)}} = \frac{1}{G_\nu}\,\frac{4\pi^2}{h}\,|V_{kn}|^2\,\sigma_n\,t \qquad (99.21)$$

as a suitable approximate expression for the probability of finding that
a transition has occurred at time t from the condition described by the
original collection of states in the range $\Delta E$ back to the discrete
state k, where $|V_{kn}|^2$ and $\sigma_n$ are to be given the values which they assume
when $E_n^0 = E_k^0$.
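The step from a sum over closely spaced states to a result proportional to the time rests on the integral of $4\sin^2(\pi x t/h)/x^2$ over the energy difference x, which equals $4\pi^2 t/h$ when the limits are extended to infinity. The sketch below checks this numerically; the function names, the choice of units with h = 1, and the finite half-width of 50 energy units are assumptions made for illustration.

```python
import math


def transition_integral(t, half_width, h=1.0, steps=200_000):
    """Midpoint-rule estimate of the integral of 4 sin^2(pi x t / h) / x^2
    over the energy difference x in the range |x| < half_width."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        # With an even number of steps no midpoint falls on x = 0.
        x = -half_width + (i + 0.5) * dx
        total += 4.0 * math.sin(math.pi * x * t / h) ** 2 / (x * x)
    return total * dx


def linear_limit(t, h=1.0):
    """Value of the same integral taken from minus to plus infinity."""
    return 4.0 * math.pi ** 2 * t / h


ratios = [
    transition_integral(t, half_width=50.0) / linear_limit(t)
    for t in (1.0, 2.0, 4.0)
]
```

Each ratio comes out close to unity, so that doubling t very nearly doubles the integral. The small deficit is the contribution from energy differences outside the chosen window, which the resonance denominator keeps small.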

This result may now be compared with our previous expression
(99.10) giving the probability for the inverse transition from state k to
the group $G_\nu$. Since we can take the number of states corresponding
to the initial condition as unity, $G_\kappa = 1$, this may be written in a
form similar to (99.21),

$$W_\nu(t) = \frac{1}{G_\kappa}\,\frac{4\pi^2}{h}\,|V_{nk}|^2\,\sigma_n\,t. \qquad (99.22)$$

We see that both kinds of transition would take place with a
probability proportional to the time. Furthermore, since the Hermitian
character of the matrix elements $V_{nk}$ will give us

$$|V_{kn}|^2 = |V_{nk}|^2, \qquad (99.23)$$

we note that the probabilities per unit time for the two transitions
differ only by the factors $G_\kappa$ and $G_\nu$ which give the number of states
in the two conditions of interest. The approximations necessary for the
validity of (99.21) are similar to those which were previously discussed
in connexion with the relation now given by (99.22).

(c) Transition from one group of continuous states to another. The
foregoing treatments, which are due to Pauli,† are sufficient to illustrate
the general character of time-proportional transitions. With systems
composed of many molecules or other elements, however, as is often
the case of typical interest in statistical mechanical studies, we can
expect an unperturbed Hamiltonian operator for the system to provide
energy levels all of which will in general be practically continuous
rather than discrete in character. Hence we may now supplement the
discussion of transitions between a discrete state and a group of
continuous states by a discussion of transitions between two such groups
of continuous unperturbed energy eigenstates.

To carry out the treatment we may take $G_\kappa$, $G_\lambda$, $G_\mu$, etc., as being
the 'weights' or numbers of individual states k, l, m, etc., occurring in
the different groups of interest. In any particular problem we shall
take the different groups as corresponding to observably different
situations, but as each corresponding to the same given range in energy
$E$ to $E + \Delta E$, which in any case cannot be taken too narrow owing
to the finite time t allowable for observation, as already discussed in
§ 98 (c).

We must now investigate the probability for transition between the
conditions specified by one such group of states k and another of states
n. For the sake of generality we shall now allow for the possibility that
the energy levels $E_k^0$ and $E_n^0$ of the kind present in the two spectra may
be degenerate, and shall denote the individual unresolved states for
a given level by symbols of the form

$$kr \quad (r = 1, 2, \ldots, g_k) \qquad \text{and} \qquad ns \quad (s = 1, 2, \ldots, g_n), \qquad (99.24)$$

where $g_k$ and $g_n$ are the respective multiplicities of the two kinds of
level. The symbols $c_{kr}(t)$ and $c_{ns}(t)$ will be used as expressions for the
probability coefficients for the different individual eigenstates in the
two groups. We may then take the initial condition of our system at
time $t = 0$ as corresponding to

$$\sum_{k,r} W_{kr}(0) = \sum_{k,r} c_{kr}^*(0)\,c_{kr}(0) = 1, \qquad (99.25)$$

together with

$$c_{ns}(0) = 0, \qquad (99.26)$$

where the summation is taken in (99.25) over all the eigenstates in
the first group, and (99.26) must hold for all eigenstates in the second
group. This will then put our system at time $t = 0$ definitely in the
condition corresponding to the group of states k, and we may now
calculate the probability for finding it in the group of states n at a later
time t.

† Pauli, Sommerfeld Festschrift, Leipzig, 1928, p. 30.


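Making use of our general formula (96.9) for probability coefficients
as a function of time, and noting (99.26), we can write

$$c_{ns}(t) = \sum_{k,r} V_{ns,kr}\,c_{kr}(0)\,\frac{1 - e^{(2\pi i/h)(E_n^0 - E_k^0)t}}{E_n^0 - E_k^0} \qquad (99.27)$$

as an expression for the value at time t of the probability coefficient
for any particular state ns in the second group of $G_\nu$ states, where
$V_{ns,kr}$ is the indicated element of the perturbation matrix and the
summation is to be taken over all states of the first group. Similarly, for
the complex conjugate of this quantity we can write

$$c_{ns}^*(t) = \sum_{k',r'} V_{ns,k'r'}^*\,c_{k'r'}^*(0)\,\frac{1 - e^{-(2\pi i/h)(E_n^0 - E_{k'}^0)t}}{E_n^0 - E_{k'}^0}. \qquad (99.28)$$

Multiplying (99.27) and (99.28), and summing over all $G_\nu$ states ns in
the second group, we then obtain as the total probability for finding the
system at time t in the condition corresponding to these $G_\nu$ states

$$P_\nu(t) = \sum_{n,s} c_{ns}^*(t)\,c_{ns}(t) = \sum_{n,s}\,\sum_{k',r'}\,\sum_{k,r} V_{ns,k'r'}^* V_{ns,kr}\,c_{k'r'}^*(0)\,c_{kr}(0)\,\frac{\bigl[1 - e^{-(2\pi i/h)(E_n^0 - E_{k'}^0)t}\bigr]\bigl[1 - e^{(2\pi i/h)(E_n^0 - E_k^0)t}\bigr]}{(E_n^0 - E_{k'}^0)(E_n^0 - E_k^0)}. \qquad (99.29)$$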


As in the earlier example provided by (99.18), this expression could
be given a specific evaluation only on the basis of some specific
assignment of values to the initial coefficients $c_{kr}(0)$. This we do not have,
however, since our knowledge of the initial situation is only sufficient
to give us (99.25), which could be satisfied in many different ways
leading to many actually different results. Hence, as previously in
§ 99 (b), we must resort to methods of an essentially statistical
mechanical character, and try to find what we might expect on the average.
To do this we may again calculate the mean behaviour in an
appropriate representative ensemble, set up in agreement with the postulate
of equal a priori probabilities and random a priori phases, see (98.7),
so as to have the initial density matrix, at time $t = 0$,

$$\rho_{kr,k'r'} = \overline{\overline{c_{k'r'}^*(0)\,c_{kr}(0)}} = \begin{cases} \dfrac{1}{G_\kappa}\,\delta_{k'k}\,\delta_{r'r} & \text{(state } kr \text{ in group } G_\kappa\text{)}, \\[1ex] 0 & \text{(state } kr \text{ not in group } G_\kappa\text{)}, \end{cases} \qquad (99.30)$$

where the double bar once more signifies a mean for all the members
of the ensemble. Combining with (99.29) and changing to trigonometric
form, equation (99.29) is then seen to give

$$P_\nu(t) = \frac{1}{G_\kappa} \sum_{n,s}\,\sum_{k,r} 4\,|V_{ns,kr}|^2\,\frac{\sin^2\bigl[(\pi/h)(E_n^0 - E_k^0)t\bigr]}{(E_n^0 - E_k^0)^2} \qquad (99.31)$$

for the probability of finding a system in the ensemble in the group of
$G_\nu$ states at time t, where the summations are now over all states ns
and kr of the two groups.

To evaluate this summation we may now take

$$dk = \sigma_k\,dE_k^0 \qquad \text{and} \qquad dn = \sigma_n\,dE_n^0 \qquad (99.32)$$

as expressions for the numbers of individual states of the kind indicated
in the energy ranges $dE_k^0$ and $dE_n^0$. Substituting in (99.31), and
replacing an appropriate part of the summation by integration, this then
gives

$$P_\nu(t) = \frac{\sigma_k\,\sigma_n}{G_\kappa} \sum_{s=1}^{g_n}\,\sum_{r=1}^{g_k} 4\,|V_{ns,kr}|^2 \int_E^{E+\Delta E}\!\!\int_E^{E+\Delta E} \frac{\sin^2\bigl[(\pi/h)(E_n^0 - E_k^0)t\bigr]}{(E_n^0 - E_k^0)^2}\,dE_n^0\,dE_k^0 \qquad (99.33)$$

where certain factors have been taken outside the integral signs since
they may be regarded as constant over the range $E$ to $E + \Delta E$ selected
as corresponding to observation.

To perform the integrations indicated in (99.33), we may first
integrate with respect to $E_n^0$ holding $E_k^0$ constant. In doing so, we may
then again assume, as in the treatment of § 99 (a), that for a time t
which is not too short the quantity $(E_n^0 - E_k^0)$ can be assumed to vary
over the whole range from minus to plus infinity. Evaluating the first
integration with the help of this assumption, and then carrying out the
second integration between the actual limits given, we then obtain
the desired expression

$$P_\nu(t) = \frac{4\pi^2}{h}\,\frac{\sigma_k\,\sigma_n\,\Delta E}{G_\kappa} \sum_{s=1}^{g_n}\,\sum_{r=1}^{g_k} |V_{ns,kr}|^2\,t \qquad (99.34)$$

for the probability of finding a system of the ensemble at time t in one
or another of the $G_\nu$ individual states belonging to the final condition.
We again obtain a probability of transition which increases linearly
with time. The approximations made in the derivation are similar to
those already discussed in § 99 (a); and although transitions to
unperturbed energy states lying outside the energy range $E$ to $E + \Delta E$ are
not absolutely prevented by the quantum mechanical form of the
energy principle, they will nevertheless take place with a very small
probability owing to the effect of the resonance denominator in
(99.33).

In a similar manner, starting out at time $t = 0$ with a representative
ensemble corresponding to a system in the condition given by the group
of $G_\nu$ states of the kind n, we should be led to an expression of the
form

$$P_\kappa(t) = \frac{4\pi^2}{h}\,\frac{\sigma_k\,\sigma_n\,\Delta E}{G_\nu} \sum_{r=1}^{g_k}\,\sum_{s=1}^{g_n} |V_{kr,ns}|^2\,t \qquad (99.35)$$

for the average probability of finding the condition corresponding to
the $G_\kappa$ states of the kind k, at time t. We note that there will be a simple
relation between the probabilities for the two inverse transitions, owing
to the equality

$$|V_{kr,ns}|^2 = |V_{ns,kr}|^2 \qquad (99.36)$$

arising from the Hermitian character of the perturbation operator V.

(d) General formulation of transition probabilities. It will now be
desirable for later use to give a somewhat more general formulation to
the foregoing results on time-proportional transitions. In accordance
with all of the expressions (99.21), (99.22), (99.34), and (99.35) which
we have obtained for such transitions, it will be seen that we can
represent the average probability of finding a system in a condition $\nu$ at time
t, which starts in a condition $\kappa$ at time $t = 0$, by an expression of
the form

$$P_\nu(t) = \frac{a_{\kappa\nu}}{G_\kappa}\,t, \qquad (99.37)$$

where $G_\kappa$ is the number of individual unperturbed eigenstates k
corresponding to the condition $\kappa$, and $a_{\kappa\nu}$ is an abbreviation for certain
factors that we do not now need to express in detail. Similarly, for
the inverse transition we shall have an expression of the form

$$P_\kappa(t) = \frac{a_{\nu\kappa}}{G_\nu}\,t, \qquad (99.38)$$

where we can write

$$a_{\kappa\nu} = a_{\nu\kappa} \qquad (99.39)$$

on account of the Hermitian character of the perturbation matrices
$V_{ns,kr}$ and $V_{kr,ns}$ which determine the probabilities of transition. The two
expressions (99.37) and (99.38) give the probabilities at time t of
finding a system in the group of states $G_\nu$ or $G_\kappa$ in a representative
ensemble which has been set up initially at time $t = 0$ to correspond
to a system of interest in the group of states $G_\kappa$ or $G_\nu$ respectively.

We may now consider somewhat more general possibilities for the
initial condition of the system of interest by assuming that the original
observation at time $t = 0$ is not such as to place the system definitely
in a single group of states but such that we assign the probabilities
$P_\kappa$, $P_\lambda$, $P_\mu$, etc., to different groups of states $G_\kappa$, $G_\lambda$, $G_\mu$, etc., in number,
each of which corresponds to the range in energy $E$ to $E + \Delta E$ limiting our
observations. The initial condition of our representative ensemble can
then be described by taking $P_\kappa$, $P_\lambda$, $P_\mu$, etc., as the total probabilities
for each of these groups, and by assigning equal probabilities and
random phases to the individual states within each such group.

To treat this more general case it will now be convenient to define
a new quantity characterizing transitions by the expression

$$A_{\kappa\nu} = \frac{a_{\kappa\nu}}{G_\kappa\,G_\nu}, \qquad (99.40)$$

for which we have

$$A_{\kappa\nu} = A_{\nu\kappa} \qquad (99.41)$$

from the previous equality (99.39) which results from the Hermitian
character of perturbation matrices. Making use of (99.37), we can
evidently write for the rate of transition in the representative ensemble
from condition $\kappa$ to $\nu$

$$Z_{\kappa\nu} = A_{\kappa\nu}\,G_\nu\,P_\kappa, \qquad (99.42)$$

and making use of (99.38), can write for the rate of inverse transition
from condition $\nu$ to $\kappa$ the symmetrical expression

$$Z_{\nu\kappa} = A_{\nu\kappa}\,G_\kappa\,P_\nu, \qquad (99.43)$$

where these new quantities are to be understood in the sense that
$Z_{\kappa\nu}$ and $Z_{\nu\kappa}$ express those contributions to the total rates of change in
$P_\kappa$ and $P_\nu$ which are to be ascribed respectively to transitions in the
ensemble from condition $\kappa$ to $\nu$ and from condition $\nu$ to $\kappa$.

It will be noted that the validity of the above expressions depends
on the assignment of random phases to the states in the original
representative ensemble. It is this circumstance which permits us to set the
probability for transition from condition $\kappa$ to $\nu$ in the ensemble under
consideration equal to a product of the probability for such a transition,
in an ensemble where $\kappa$ is the only condition represented, multiplied
by the probability $P_\kappa$ for that condition in the actual ensemble
considered. Hence these expressions only apply at the time of an initial
observation which can be represented in the manner described.

The results given by (99.42) and (99.43), together with the equality
of $A_{\kappa\nu}$ and $A_{\nu\kappa}$ given by (99.41), will prove very important in deriving
the quantum mechanical H-theorem in the next chapter.
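The consequences of symmetric coefficients $A_{\kappa\nu} = A_{\nu\kappa}$ can be made concrete with a small numerical sketch. It assumes rates of the form $Z_{\kappa\nu} = A_{\kappa\nu} G_\nu P_\kappa$, and it further assumes, going beyond what this section establishes, that these rate expressions may be reapplied at every instant rather than only at the time of an initial observation. The function name `step` and the numerical values of G and A are hypothetical.

```python
def step(P, G, A, dt):
    """One Euler step of dP_k/dt = sum over v of (Z_vk - Z_kv),
    with the assumed rate Z_kv = A[k][v] * G[v] * P[k]."""
    count = len(P)
    new_P = list(P)
    for k in range(count):
        for v in range(count):
            if v != k:
                z_kv = A[k][v] * G[v] * P[k]  # flow out of k into v
                z_vk = A[v][k] * G[k] * P[v]  # flow out of v into k
                new_P[k] += (z_vk - z_kv) * dt
    return new_P


G = [1, 2, 5]                      # hypothetical numbers of states per group
A = [[0.0, 0.3, 0.1],
     [0.3, 0.0, 0.2],
     [0.1, 0.2, 0.0]]              # hypothetical symmetric coefficients
P = [1.0, 0.0, 0.0]                # system initially in the first group
for _ in range(5000):
    P = step(P, G, A, dt=0.01)
```

Under these assumptions the total probability is conserved at every step, and the probabilities settle to values proportional to the numbers of states $G_\kappa$, since the net flow between two groups vanishes when $P_\kappa / G_\kappa = P_\nu / G_\nu$.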

100. The probabilities for transition by collision in Fermi-Dirac
and Einstein-Bose gases

The changes in condition of a dilute gas as a result of molecular 
interaction or collision provide important special cases of quantum 
mechanical transitions which take place — at least in first approxima- 
tion — with a probability which is proportional to the time. We shall 
devote the present section to the treatment of such transitions, applying 
methods similar to those developed in the preceding section both to 
the case of a system of Fermi-Dirac particles, characterized by anti- 
symmetric eigenfunctions, and to the case of a system of Einstein-Bose 
particles, characterized by symmetric eigenfunctions. We shall thus be
able to obtain the quantum analogues of the classical expressions for
the probability of collisions which were used in deriving the classical
form of Boltzmann's H-theorem. In the next chapter we shall use these
quantum analogues in deriving a quantum form of H-theorem.

In order to treat these problems, using the language of the method
of variation of constants, we shall regard the true Hamiltonian operator
H for the gas as separated into two parts

$$H = H^0 + V, \qquad (100.1)$$

where the unperturbed Hamiltonian $H^0$ corresponds to the energy of
the particles neglecting their interaction, and the perturbation operator
V corresponds to the remaining energy associated with the forces of
interaction between them. Equations of the form

$$H^0\,U_k(q_1 \ldots q_n) = E_k^0\,U_k(q_1 \ldots q_n), \qquad (100.2)$$

where $q_1 \ldots q_n$ represent the coordinates for the n particles composing
the system, will then determine the unperturbed eigenfunctions $U_k$ and
unperturbed eigenvalues of the energy $E_k^0$ for the different eigenstates
k to be used in describing the condition of the system. And equations
of the form

$$V_{nk} = \int\!\cdots\!\int U_n^*(q_1 \ldots q_n)\,V\,U_k(q_1 \ldots q_n)\,dq_1 \ldots dq_n \qquad (100.3)$$

will determine the elements of the perturbation matrix involved in the
transitions from one eigenstate k to another eigenstate n which result
from the interaction or collision of particles.

It will first be necessary for us to obtain more specific expressions 
for these elements of the perturbation matrix. This will involve some- 
what lengthy and complicated calculations, which will be carried out 
separately for the Fermi-Dirac and Einstein-Bose cases. We start with 
the Fermi-Dirac case, which is the simpler of the two. 

(a) Perturbation matrix for the interaction of Fermi-Dirac particles.
In accordance with our previous treatment as given in § 76 (d), the
antisymmetric unperturbed eigenfunctions for a Fermi-Dirac gas, composed
of n similar particles in a container, would be of the form

$$U_S(q_1 \ldots q_n) = A \sum_P (\mp 1)\,P\,u_k(q_1)\,u_l(q_2)\,u_m(q_3) \cdots u_r(q_n) = A \sum_{P_\alpha} (\mp 1)\,P_\alpha \prod_k u_k(q_\alpha), \qquad (100.4)$$

where the second form of writing will serve as a convenient
abbreviation. The quantity A occurring in the above expression is an
appropriate normalizing factor, the symbols $u_k(q) \ldots u_r(q)$ represent individual
eigenfunctions for a single particle in the container in question, the
symbols $q_1 \ldots q_n$ stand for the coordinates of the n particles, and the
formalism indicates that we are to take a summation over all
permutations P of the particle indices $\alpha = 1, \ldots, n$, using the minus or plus sign
according as the permutation is odd or even. As already discussed in
§ 76 (d), it is evident that the individual eigenfunctions $u_k(q) \ldots u_r(q)$
appearing above would all have to be different from each other in order that
the total expression given by the summation should not reduce to zero.
Hence the summation expressed by (100.4) will contain $n!$ terms all of
which are different.
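The signed sum over permutations in (100.4) can be written out directly for a small number of particles. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the three single-particle functions 1, q, and q squared are chosen only for convenience and are not orthonormal, and the factor $(n!)^{-1/2}$ is the normalization appropriate to orthonormal single-particle eigenfunctions.

```python
import itertools
import math


def parity(perm):
    """Return +1 for an even permutation of 0..n-1 and -1 for an odd one."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]  # one transposition
            sign = -sign
    return sign


def antisymmetric_amplitude(orbitals, coords):
    """(n!)^(-1/2) times the signed sum over permutations of the
    product of orbitals[a] evaluated at coords[perm[a]]."""
    n = len(orbitals)
    total = 0.0
    for perm in itertools.permutations(range(n)):
        term = float(parity(perm))
        for a in range(n):
            term *= orbitals[a](coords[perm[a]])
        total += term
    return total / math.sqrt(math.factorial(n))


# Hypothetical single-particle functions, for illustration only.
orbitals = [lambda q: 1.0, lambda q: q, lambda q: q * q]
value = antisymmetric_amplitude(orbitals, [0.1, 0.5, 0.9])
swapped = antisymmetric_amplitude(orbitals, [0.5, 0.1, 0.9])
repeated = antisymmetric_amplitude(
    [orbitals[0], orbitals[1], orbitals[1]], [0.1, 0.5, 0.9]
)
```

Interchanging two coordinates reverses the sign of the result, and using the same single-particle function twice makes the whole sum vanish, in agreement with the remark that the individual eigenfunctions must all be different.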



We must begin by determining the magnitude of the factor A, which
will be necessary in order that such an eigenfunction for the system
as a whole shall be properly normalized to unity, that is, in order to
secure the result

$$1 = \int\!\cdots\!\int U_S^*(q_1 \ldots q_n)\,U_S(q_1 \ldots q_n)\,dq_1 \ldots dq_n = A^*A \int\!\cdots\!\int \sum_{P_\alpha} (\mp 1)\,P_\alpha \prod_k u_k^*(q_\alpha)\,\sum_{P_\alpha} (\mp 1)\,P_\alpha \prod_k u_k(q_\alpha)\,dq_1 \ldots dq_n. \qquad (100.5)$$

To calculate this value let us first consider a particular term, say

$$u_k(q_1)\,u_l(q_2)\,u_m(q_3) \cdots u_r(q_n),$$

of the $n!$ different terms occurring in the second summation appearing
in the above integral. Then it is evident that there will be one and
only one term occurring in the first summation, namely

$$u_k^*(q_1)\,u_l^*(q_2)\,u_m^*(q_3) \cdots u_r^*(q_n),$$

which has the same assignment of coordinate and eigenfunction indices
in the different factors that make up the term. Hence, making use of
the orthogonality of the different individual eigenfunctions $u_k(q) \ldots u_r(q)$, it
is evident that each of the $n!$ terms in the second summation will be
multiplied by only one term in the first summation which will lead to
a result other than zero on integration. And, furthermore, making use
of the assumed previous normalization of the individual eigenfunctions,
it is evident that such a pair of terms will give the result unity on
integration. As a consequence we at once see that the normalization (100.5)
can only be secured by assigning to the normalizing factor A a
magnitude such that

$$A^*A = \frac{1}{n!} \qquad \text{or} \qquad |A| = \frac{1}{(n!)^{1/2}}. \qquad (100.6)$$

We are now ready to proceed to the determination of the elements
of the perturbation matrix, corresponding to a transition from one such
eigenfunction for the system as a whole to another. To make the
calculation we shall assume that the perturbation operator, corresponding
to the mutual energy of interaction between the particles, would be
given by an expression of the form

$$V(q_1 \ldots q_n) = \tfrac{1}{2} \sum_{\beta,\gamma} V(q_\beta, q_\gamma), \qquad (100.7)$$

where $V(q_\beta, q_\gamma)$ expresses the mutual energy of two particles as a
function of their coordinates, and we take the appropriate expression for
the result of summing over all pairs of particles. The expression
neglects, for our present case of a dilute gas, the necessity of including
terms depending on the coordinates for more than one pair of particles
at a time.

Using the above expression for the perturbation operator, we should
then have an expression of the form

$$V_{S'S} = \int\!\cdots\!\int U_{S'}^*\,V\,U_S\,dq_1 \ldots dq_n = \tfrac{1}{2} A'^*A \int\!\cdots\!\int \sum_{P_\alpha} (\mp 1)\,P_\alpha \prod_{k'} u_{k'}^*(q_\alpha)\,\sum_{\beta,\gamma} V(q_\beta, q_\gamma) \times \sum_{P_\alpha} (\mp 1)\,P_\alpha \prod_k u_k(q_\alpha)\,dq_1 \ldots dq_n \qquad (100.8)$$

for the element of the perturbation matrix corresponding to the
transition from the state corresponding to one eigenfunction $U_S$ to the state
corresponding to a different one $U_{S'}$.

To evaluate this expression we note in the first place, as a result of
the orthogonality of the individual eigenfunctions, that the expression
would in any case reduce to zero if more than two of the eigenfunctions
$u_{k'}(q)$ appearing in $U_{S'}$ should differ from any eigenfunction in $U_S$.
Two eigenfunctions in $U_S$ could be replaced by different ones in $U_{S'}$,
however, without getting a null result, since they could be assigned the
coordinates $q_\beta$ and $q_\gamma$ which appear in $V(q_\beta, q_\gamma)$. We shall consider that
$u_k(q)$ and $u_l(q)$ in $U_S$ are replaced by $u_m(q)$ and $u_n(q)$ in $U_{S'}$. We can
then write (100.8) in the more explicit form

$$V_{mn,kl} = \tfrac{1}{2} A'^*A \int\!\cdots\!\int \sum_P (\mp 1)\,P\,u_m^*(q_1)\,u_n^*(q_2)\,u_o^*(q_3) \cdots u_r^*(q_n) \times \sum_{\beta,\gamma} V(q_\beta, q_\gamma)\,\sum_P (\mp 1)\,P\,u_k(q_1)\,u_l(q_2)\,u_o(q_3) \cdots u_r(q_n)\,dq_1 \ldots dq_n, \qquad (100.9)$$

where the individual eigenfunctions $u_o \ldots u_r$ belonging to the total
eigenfunction $U_S$ are matched pair by pair with the eigenfunctions $u_o^* \ldots u_r^*$
in $U_{S'}^*$.

To continue with the evaluation, let us now consider a particular
term, say

$$u_k(q_1)\,u_l(q_2)\,u_o(q_3) \cdots u_r(q_n), \qquad (100.10)$$

of the different terms appearing in the second summation over the
different permutations of the particle indices in (100.9). There are in all $n!$
such terms.

Having chosen this term, let us next consider a particular assignment
of values to $\beta$ and $\gamma$, say

$$\beta = 1, \qquad \gamma = 2, \qquad (100.11)$$

which can lead to a non-vanishing result on integration. There are in
all two such assignments ($\beta = 1$, $\gamma = 2$) and ($\beta = 2$, $\gamma = 1$) which
would lead to the same final result.

Having made the above assignment for $\beta$ and $\gamma$, let us finally pick
out the terms in the first summation over the permutations of particle
indices, which would lead to a non-vanishing result. These are seen
to be

$$u_m^*(q_1)\,u_n^*(q_2)\,u_o^*(q_3) \cdots u_r^*(q_n) \qquad \text{and} \qquad u_m^*(q_2)\,u_n^*(q_1)\,u_o^*(q_3) \cdots u_r^*(q_n). \qquad (100.12)$$

Taking account of the numbers of possibilities given above for
different choices which would lead to an end result of the same magnitude,
and making use of the normalization of the individual eigenfunctions
we now see that (100.9) will reduce to

$$V_{mn,kl} = \tfrac{1}{2} A'^*A\,n!\;2 \iint \bigl[u_m^*(q_1)\,u_n^*(q_2) - u_m^*(q_2)\,u_n^*(q_1)\bigr]\,V(q_1, q_2)\,u_k(q_1)\,u_l(q_2)\,dq_1\,dq_2. \qquad (100.13)$$

Noting the magnitudes of the normalizing factors A' and A given by
(100.6), and using $I_1$ to denote the so-called 'direct integral'

$$I_1 = \iint u_m^*(q_1)\,u_n^*(q_2)\,V(q_1, q_2)\,u_k(q_1)\,u_l(q_2)\,dq_1\,dq_2, \qquad (100.14)$$

and $I_2$ for the so-called 'exchange integral'

$$I_2 = \iint u_m^*(q_2)\,u_n^*(q_1)\,V(q_1, q_2)\,u_k(q_1)\,u_l(q_2)\,dq_1\,dq_2, \qquad (100.15)$$

we then obtain as the desired expression for the square of the absolute
magnitude of the above matrix element

$$|V_{mn,kl}|^2 = |I_1 - I_2|^2. \qquad (100.16)$$

In accordance with our development of the method of variation of
constants (note, for example, equation (96.12)) it will be seen that
this is the quantity which would determine the probability of a
collisional transition in which the numbers of particles in the individual
eigenstates k, l, m, and n change in the manner indicated by

$$\begin{pmatrix} n_k = n_l = 1 \\ n_m = n_n = 0 \end{pmatrix} \rightarrow \begin{pmatrix} n_k = n_l = 0 \\ n_m = n_n = 1 \end{pmatrix}, \qquad (100.17)$$

where, in agreement with the Pauli exclusion principle, the only possible
occupation numbers are zero and one. It will be noted, moreover, that
our derivation of (100.16) only applies to a case where the initial
situation is that described by (100.17), and that the value of $|V_{mn,kl}|^2$ would
reduce to zero with any other choice for the initial occupation numbers.



Hence we may now write as an entirely general expression for this
important quantity, in the case of Fermi-Dirac particles,

$$|V_{mn,kl}|^2 = |I_1 - I_2|^2\,n_k\,n_l\,(1 - n_m)(1 - n_n), \qquad (100.18)$$

where the initial occupation numbers appear in the added factors in
such a way as to give 1 or 0, according as the initial situation does or
does not agree with that given in (100.17).

(b) Perturbation matrix for the interaction of Einstein-Bose particles.
We must next determine the analogous expression for the square of
the perturbation matrix in the case of Einstein-Bose particles. The
computation is somewhat more complicated than for the Fermi-Dirac
case, but will be carried out so as to run parallel to the treatment just
given which will help to make it understandable.

In accordance with the discussion given in § 76 (d), the symmetric
eigenfunction for an Einstein-Bose gas composed of n similar particles
may be expressed in the form

$$U_S(q_1 \ldots q_n) = A \sum_P P\,u_k(q_1) \cdots u_k(q_{n_k})\,u_l(q_{n_k+1}) \cdots u_r(q_n) = A \sum_P P \prod_k [u_k(q)]^{n_k}, \qquad (100.19)$$

where it is hoped that the second manner of writing will serve as a
convenient and informative abbreviation. The quantity A, occurring
above, again represents an appropriate normalizing factor, the symbols
$u_k(q) \ldots u_r(q)$ represent individual eigenfunctions for a single particle, the
symbols $q_1 \ldots q_n$ stand for the coordinates of the n particles, and the
formalism indicates that we are to take a summation over all
permutations P of the particle indices $1, \ldots, n$. The form of writing in (100.19)
gives specific expression to the possibility of now having more than one
particle assigned to the same individual eigenfunction (e.g. $n_k$ to $u_k$),
without a reduction of the whole eigenfunction to zero.

We must again begin by determining, for the present case, the
magnitude of the factor A which will secure the proper normalization,

$$1 = \int\!\cdots\!\int U_S^*\,U_S\,dq_1 \ldots dq_n = A^*A \int\!\cdots\!\int \sum_P P \prod_k [u_k^*(q)]^{n_k}\,\sum_P P \prod_k [u_k(q)]^{n_k}\,dq_1 \ldots dq_n. \qquad (100.20)$$

To calculate this value let us first consider a particular term, say

$$u_k(q_1) \cdots u_k(q_{n_k})\,u_l(q_{n_k+1}) \cdots u_l(q_{n_k+n_l}) \cdots u_r(q_n),$$

of the $n!$ successive terms which make up the second summation
appearing in the above integral. In accordance with the normalization and
orthogonality of the individual eigenfunctions, it is then evident that
if we multiply this by the similar term

$$u_k^*(q_1) \cdots u_k^*(q_{n_k})\,u_l^*(q_{n_k+1}) \cdots u_l^*(q_{n_k+n_l}) \cdots u_r^*(q_n)$$

occurring in the first summation and then integrate we shall obtain
the result unity. Furthermore, it is evident that we can carry out
$n_k!\,n_l! \cdots n_r!$ permutations of the particle indices in this latter term
before we reach a situation where the orthogonality of the different
individual eigenfunctions would lead to the result zero. Hence, since
there are in all $n!$ successive terms in the second summation over the
permutations P, we see that the normalization (100.20) is to be obtained
by assigning to the factor A a magnitude such that

$$A^*A = \frac{1}{n!\,n_k!\,n_l! \cdots n_r!} \qquad \text{or} \qquad |A| = \frac{1}{(n!\,n_k!\,n_l! \cdots n_r!)^{1/2}}. \qquad (100.21)$$
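The counting that leads to (100.21) can be checked by brute force in a small model. The sketch below is an illustration under stated assumptions rather than a part of the derivation: each coordinate is allowed to take only as many discrete values as there are distinct single-particle states, and the individual eigenfunctions are taken as Kronecker deltas, which are orthonormal under summation. The function names are hypothetical.

```python
import itertools
import math
from collections import Counter


def symmetric_norm_squared(assigned):
    """Sum over all coordinate values of the square of the unnormalized
    symmetric sum over permutations, for Kronecker-delta orbitals.

    `assigned` lists the single-particle state given to each particle,
    for example ("k", "k", "l") for n_k = 2 and n_l = 1.
    """
    n = len(assigned)
    states = sorted(set(assigned))
    total = 0
    for coords in itertools.product(states, repeat=n):
        # Each permutation contributes 1 if every factor is non-zero.
        amplitude = sum(
            all(assigned[a] == coords[perm[a]] for a in range(n))
            for perm in itertools.permutations(range(n))
        )
        total += amplitude ** 2
    return total


def predicted(assigned):
    """n! n_k! n_l! ... , the reciprocal of A*A according to (100.21)."""
    value = math.factorial(len(assigned))
    for count in Counter(assigned).values():
        value *= math.factorial(count)
    return value


cases = [("k", "k", "l"), ("k", "k", "k"), ("k", "l", "m"), ("k", "k", "l", "l")]
checks = [(symmetric_norm_squared(c), predicted(c)) for c in cases]
```

For each case the directly summed square of the unnormalized eigenfunction agrees with $n!\,n_k!\,n_l! \cdots$, so that multiplying by the value of $A^*A$ in (100.21) gives unity.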


We are now ready to proceed to the determination of the elements
of the perturbation matrix corresponding to the transition from one
such eigenfunction $U_S$ for the system as a whole to another $U_{S'}$. In the
present case these elements will be given by an expression of the form

$$V_{S'S} = \int\!\cdots\!\int U_{S'}^*(q_1 \ldots q_n)\,V(q_1 \ldots q_n)\,U_S(q_1 \ldots q_n)\,dq_1 \ldots dq_n = \tfrac{1}{2} A'^*A \int\!\cdots\!\int \sum_P P \prod_{k'} [u_{k'}^*(q)]^{n_{k'}}\,\sum_{\beta,\gamma} V(q_\beta, q_\gamma)\,\sum_P P \prod_k [u_k(q)]^{n_k}\,dq_1 \ldots dq_n, \qquad (100.22)$$

where we again take $\tfrac{1}{2} \sum V(q_\beta, q_\gamma)$ as an appropriate expression for the
perturbation operator.

To evaluate the above, we again begin by noting that the result
would reduce to zero, as a consequence of the orthogonality of the
individual eigenfunctions, if more than two eigenfunctions $u_k(q)$ are
changed to different eigenfunctions $u_{k'}(q)$ when we pass from $U_S$ to $U_{S'}$.
Two such eigenfunctions, however, let us say $u_k$ and $u_l$, could be
replaced by different ones, say $u_m$ and $u_n$, in passing from $U_S$ to $U_{S'}$,
without leading to a null result since these could be the eigenfunctions
to which we assign the coordinates $q_\beta$ and $q_\gamma$ appearing in $V(q_\beta, q_\gamma)$.
Making such a replacement, the transition from $U_S$ to $U_{S'}$ could then be
described by the formalism

$$\begin{pmatrix} n_k,\; n_l \\ n_m,\; n_n \end{pmatrix} \rightarrow \begin{pmatrix} n_k - 1,\; n_l - 1 \\ n_m + 1,\; n_n + 1 \end{pmatrix}. \qquad (100.23)$$



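And we can then write the element of the perturbation matrix in the
more explicit form

$$V_{mn,kl} = \tfrac{1}{2} A'^*A \int\!\cdots\!\int \sum_P P\,[u_k^*(q)]^{n_k-1}[u_l^*(q)]^{n_l-1}[u_m^*(q)]^{n_m+1}[u_n^*(q)]^{n_n+1}[u_o^*(q)]^{n_o} \cdots [u_r^*(q)]^{n_r} \times \sum_{\beta,\gamma} V(q_\beta, q_\gamma)\,\sum_P P\,[u_k(q)]^{n_k}[u_l(q)]^{n_l}[u_m(q)]^{n_m} \times [u_n(q)]^{n_n}[u_o(q)]^{n_o} \cdots [u_r(q)]^{n_r}\,dq_1 \ldots dq_n, \qquad (100.24)$$

where the individual eigenfunctions from o to r are matched pair by
pair in the expressions for $U_S$ and $U_{S'}^*$.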

To continue with the evaluation, let us now consider a particular
term, say

$$u_k(q_1) \cdots u_k(q_{n_k})\,u_l(q_{n_k+1}) \cdots u_l(q_{n_k+n_l})\,u_m(q_{n_k+n_l+1}) \cdots u_r(q_n), \qquad (100.25)$$

of the successive terms appearing in the second summation over the
different permutations of particle indices in (100.24). There are in all $n!$
such successive terms, although some of them can be equal to one another.

Having chosen this term, let us next consider a particular assignment
of values to $\beta$ and $\gamma$, say

$$\beta = 1, \qquad \gamma = n_k + 1, \qquad (100.26)$$

which can lead to a non-vanishing result on integration. There are
evidently $2 n_k n_l$ such assignments in all, since $\beta$ could be taken anywhere
in the range $1, \ldots, n_k$ with $\gamma$ in the range $n_k + 1, \ldots, n_k + n_l$, or vice versa.

Having made the above assignment for $\beta$ and $\gamma$, let us finally count
the number of terms appearing in the first summation over the different
permutations of particle indices in (100.24) which would lead to a
non-vanishing result on integration. Evidently there will be

$$(n_k - 1)!\,(n_l - 1)!\,(n_m + 1)!\,(n_n + 1)!\,n_o! \cdots n_r! \qquad (100.27)$$

such terms, in which $q_\beta$ appears in $u_m^*(q)$ with $q_\gamma$ in $u_n^*(q)$, and the
same number with $q_\beta$ in $u_n^*(q)$ and $q_\gamma$ in $u_m^*(q)$.

Hence, taking account of the numbers of possibilities given above
for the different choices which would lead to an end result of the same
magnitude, and making use of the normalization of the individual
eigenfunctions, we now see that (100.24) will reduce to

$$V_{mn,kl} = \tfrac{1}{2} A'^*A\,n!\;2 n_k n_l\,(n_k - 1)!\,(n_l - 1)!\,(n_m + 1)!\,(n_n + 1)!\,n_o! \cdots n_r! \times \iint \bigl[u_m^*(q_1)\,u_n^*(q_2) + u_m^*(q_2)\,u_n^*(q_1)\bigr]\,V(q_1, q_2)\,u_k(q_1)\,u_l(q_2)\,dq_1\,dq_2. \qquad (100.28)$$

Introducing the magnitudes of A' and A as given by (100.21), and using
our previous symbols $I_1$ and $I_2$ for the 'direct' and 'exchange integrals'
(100.14) and (100.15), this then gives us for the desired square of the
magnitude of this quantity

$$|V_{mn,kl}|^2 = \frac{\bigl[n!\,n_k n_l\,(n_k - 1)!\,(n_l - 1)!\,(n_m + 1)!\,(n_n + 1)!\,n_o! \cdots n_r!\bigr]^2\,|I_1 + I_2|^2}{(n!)^2\,(n_k - 1)!\,(n_l - 1)!\,(n_m + 1)!\,(n_n + 1)!\;n_k!\,n_l!\,n_m!\,n_n!\,(n_o!)^2 \cdots (n_r!)^2}, \qquad (100.29)$$

which is easily seen to reduce to the final result

$$|V_{mn,kl}|^2 = |I_1 + I_2|^2\,n_k\,n_l\,(1 + n_m)(1 + n_n). \qquad (100.30)$$

This is the quantity which will determine in the Einstein-Bose case the
probability of collisional transitions in which a pair of particles leave
the elementary states k and l to appear in the states m and n.
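The reduction of the factorial expression to the simple product of occupation numbers can be confirmed by exact arithmetic. The following sketch assumes that the coefficient of $|I_1 + I_2|^2$ is built from the squared count $n!\,n_k n_l (n_k-1)!(n_l-1)!(n_m+1)!(n_n+1)!\,n_o! \cdots$ divided by the product of the two normalization denominators of (100.21) for the initial and final occupation numbers; the function name `reduced_coefficient` and the sample occupation numbers are hypothetical.

```python
from fractions import Fraction
from math import factorial


def reduced_coefficient(n_k, n_l, n_m, n_n, others=()):
    """Exact coefficient of |I_1 + I_2|^2 from the factorial expression.

    `others` holds the occupation numbers n_o ... n_r of the states not
    involved in the collision. Requires n_k >= 1 and n_l >= 1.
    """
    n = n_k + n_l + n_m + n_n + sum(others)
    rest = 1
    for n_o in others:
        rest *= factorial(n_o)
    shifted = (factorial(n_k - 1) * factorial(n_l - 1)
               * factorial(n_m + 1) * factorial(n_n + 1))
    numerator = (factorial(n) * n_k * n_l * shifted * rest) ** 2
    denominator = (factorial(n) ** 2 * shifted
                   * factorial(n_k) * factorial(n_l)
                   * factorial(n_m) * factorial(n_n) * rest ** 2)
    return Fraction(numerator, denominator)


sample = reduced_coefficient(3, 2, 1, 4, others=(2, 5))
simple = 3 * 2 * (1 + 1) * (1 + 4)
```

With these sample values both `sample` and `simple` equal 60, and the occupation numbers of the uninvolved states drop out entirely.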

By comparison with (100.18), it will be seen that the two formulae
could be expressed in the combined form

$$|V_{mn,kl}|^2 = |I_1 \pm I_2|^2\,n_k\,n_l\,(1 \pm n_m)(1 \pm n_n), \qquad (100.31)$$

where the upper signs apply to the Einstein-Bose and the lower to the
Fermi-Dirac case, and the symbols $n_k$, $n_l$ and $n_m$, $n_n$ denote the numbers
of particles originally present in the states from and to which transition
occurs. In the Einstein-Bose case there would be no limit on the
number of particles in such states, but in the Fermi-Dirac case the possible
values would be limited to 0 and 1. In both cases the probability of
transition would be proportional to the numbers of particles in the
states from which transition occurs. In the Einstein-Bose case the
transition would be favoured by particles already present in the states
to which transition is to occur, but in the Fermi-Dirac case transition
would be prevented if either of these states were already occupied.†
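The dependence on occupation numbers in the combined formula can be tabulated with a few lines of code. This is a minimal sketch: the function name `occupation_factor` and the string labels for the two kinds of statistics are hypothetical, and only the factor $n_k n_l (1 \pm n_m)(1 \pm n_n)$ is computed, not the integrals $I_1$ and $I_2$.

```python
def occupation_factor(n_k, n_l, n_m, n_n, statistics):
    """Factor n_k n_l (1 +/- n_m)(1 +/- n_n), with the upper sign for
    Einstein-Bose and the lower sign for Fermi-Dirac particles."""
    if statistics == "einstein-bose":
        sign = 1
    elif statistics == "fermi-dirac":
        sign = -1
        if any(n not in (0, 1) for n in (n_k, n_l, n_m, n_n)):
            raise ValueError("Fermi-Dirac occupation numbers must be 0 or 1")
    else:
        raise ValueError("statistics must be 'einstein-bose' or 'fermi-dirac'")
    return n_k * n_l * (1 + sign * n_m) * (1 + sign * n_n)


allowed = occupation_factor(1, 1, 0, 0, "fermi-dirac")        # final states empty
blocked = occupation_factor(1, 1, 1, 0, "fermi-dirac")        # state m occupied
enhanced = occupation_factor(2, 3, 1, 0, "einstein-bose")     # state m occupied
```

Here `blocked` vanishes because one final state is already occupied, while `enhanced` exceeds the value $n_k n_l = 6$ that would hold for empty final states.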
(c) Time-proportional collision probabilities. Having made the fore- 
going necessary investigations of the pertinent matrix elements, we are 
now ready to study the probabilities for different specified kinds of 
collision to take place in a Fermi-Dirac or in an Einstein-Bose gas. 
We shall first have to decide on a reasonable procedure for specifying
different kinds of collision in a way that would correspond to
conceptual observational possibilities. With such a specification we shall
then find, at least in first approximation, that the probability for any
given kind of collision to occur would be proportional to the time.

In order to see what is involved in making a reasonable specification
of a given kind of collision we must first pay attention to the fact that
the individual eigenfunctions $u_k(q)$ for a single particle in a container

† For the method of generalization of these results to ternary and higher order
interactions see Jordan, Zeits. f. Phys. 45, 766 (1927).



§ 100 TRANSITION BY MOLECULAR COLLISION 445 

of any considerable size would correspond (except for the very lowest
energies) to a practically continuous spectrum of unperturbed energies $\epsilon_k$.
Hence, in a finite time of observation t, such as we should be willing
to allow for studying the initial condition of the system before the
probability of transition itself became too large, it is evident, in
agreement with the discussion of § 98, that we could not determine the
precise eigenstates occupied by the molecules since this would give a
more precise knowledge of their energy than would be allowable in
accordance with the Heisenberg uncertainty principle in the form

$$t\,\Delta\epsilon \approx h. \tag{100.32}$$

For this reason it is not feasible to specify the different kinds of
collisions taking place in a gas by stating the exact states k, l and m, n
from and to which transition takes place.

To meet this consideration we may proceed by treating our continuous
spectrum of states for a single particle as divided into groups
of states each corresponding to the same energy range $\Delta\epsilon$, which would
be allowed by the Heisenberg relation (100.32), for the time of
observation. We may use Greek letters $\kappa, \lambda, \mu, \nu, \ldots$ to designate these different
groups of states in place of our previous Latin letters for the precise
states, and may take the ‘weight’ or number of individual precise states
$k, l, m, n, \ldots$ in the different groups of energy width $\Delta\epsilon$ as designated
by the symbols $g_\kappa, g_\lambda, g_\mu, g_\nu, \ldots$. The condition of the system can now
be specified in a manner corresponding with observational possibilities
by giving the numbers of particles $n_\kappa, n_\lambda, n_\mu, n_\nu, \ldots$ in these different
groups of states, and the different possible kinds of collision can be
specified by stating the groups of states, say $\kappa, \lambda$ and $\mu, \nu$, from and to
which transition takes place.

We now turn to the computation of the probabilities for such transi- 
tions to take place. For this purpose it will be desirable to carry through 
once more a complete calculation for the specific case under considera- 
tion, rather than to try to substitute into our previous formulae for 
probabilities of transition. 

We may take the initial condition of the gas at t = 0 as one in which
there are $n_\kappa$, $n_\lambda$, $n_\mu$, and $n_\nu$ particles in the indicated groups of states,
and take the transition as specified by the formalism

$$n_\kappa \to n_\kappa - 1, \quad n_\lambda \to n_\lambda - 1, \quad n_\mu \to n_\mu + 1, \quad n_\nu \to n_\nu + 1. \tag{100.33}$$

The numbers of particles present at t = 0 in states other than those
in the groups $\kappa$, $\lambda$, $\mu$, and $\nu$ need not be specified since they will not be
involved in the computation. The method of variation of constants
must now be applied to calculate the probability of finding at a later
time t = t that the transition (100.33) has occurred.

In order to apply the method of variation of constants, which deals
with precisely defined states, we shall use the letters i and f to designate
precise states of the gas which would agree respectively with the initial
and final conditions of the gas, which were more broadly specified above
by the occupation numbers

$$(n_\kappa,\ n_\lambda,\ n_\mu,\ n_\nu) \quad\text{and}\quad (n_\kappa - 1,\ n_\lambda - 1,\ n_\mu + 1,\ n_\nu + 1)$$

for the groups of particle eigenstates of interest. Employing the
symbols $c_i(t)$ and $c_f(t)$ for the corresponding probability coefficients, we shall
then take unit probability at time t = 0,

$$\sum_i W_i(0) = \sum_i c_i^*(0)\,c_i(0) = 1, \tag{100.34}$$

for finding the system in one or another of the states i which correspond
to the specified initial condition of the gas, and zero probability, i.e.

$$c_f(0) = 0, \tag{100.35}$$

for finding the system in any state f corresponding to the final
condition of interest.

We must now compute the probability for finding the system in one or
another of the states f at a later time t = t. Making use of the
approximate integration of Schroedinger’s equation (96.9), we can write for the
value at time t of the probability coefficient for any particular state f

$$c_f(t) = \sum_i V_{fi}\,\frac{1 - e^{(2\pi i/h)(E_f - E_i)t}}{E_f - E_i}\,c_i(0), \tag{100.36}$$

where $V_{fi}$ is the element of the perturbation matrix corresponding to the
transition from i to f, and the summation is taken over all the initial
states i. Multiplying by the appropriate similar expression for the
complex conjugate quantity,

$$c_f^*(t) = \sum_{i'} V_{fi'}^*\,\frac{1 - e^{-(2\pi i/h)(E_f - E_{i'})t}}{E_f - E_{i'}}\,c_{i'}^*(0), \tag{100.37}$$

and summing over all states f which agree with the final condition of
interest, we then obtain

$$P_f(t) = \sum_f \sum_{i,\,i'} V_{fi'}^* V_{fi}\,
\frac{\bigl[1 - e^{-(2\pi i/h)(E_f - E_{i'})t}\bigr]\bigl[1 - e^{(2\pi i/h)(E_f - E_i)t}\bigr]}{(E_f - E_{i'})(E_f - E_i)}\,
c_{i'}^*(0)\,c_i(0) \tag{100.38}$$

as an expression for the desired probability for finding the system in
the final condition of interest at time t = t.



To obtain an exact evaluation of this expression it would be necessary
to have some precise assignment of values for the coefficients $c_i(0)$ and
$c_{i'}^*(0)$. This, however, we do not have, since our only knowledge of
these initial values is merely given by the fact that the total probability
for finding the system in one or another of the states i must be unity,
as shown by (100.34), and this can be satisfied in many different ways.
Hence, as in our previous treatment of similar cases in § 99, we must
now resort to methods of an essentially statistical mechanical character
and try to find what we might expect on the average. To do this we
may calculate the mean behaviour in an appropriate representative
ensemble, set up in agreement with the postulate of equal a priori
probabilities and random a priori phases (see (98.7)) so as to have the
initial density matrix, at time t = 0,

$$\rho_{ii'} = \overline{\overline{c_{i'}^*(0)\,c_i(0)}} =
\begin{cases}
\dfrac{1}{G_i}\,\delta_{ii'} & \text{(state } i \text{ in group } G_i\text{)},\\[1ex]
0 & \text{(state } i \text{ not in group } G_i\text{)},
\end{cases} \tag{100.39}$$

where $G_i$ is the total number of precise states that correspond to our
specified initial condition. Combining with (100.38), we then obtain,
after reduction to trigonometric form,

$$P_f(t) = \frac{4}{G_i} \sum_i \sum_f |V_{fi}|^2\,\frac{\sin^2 \frac{\pi}{h}(E_f - E_i)t}{(E_f - E_i)^2} \tag{100.40}$$

as an expression for the probability of finding a system in the ensemble in
the group of $G_f$ states that correspond to the final condition of interest,
where the summations are over all states i and f in the two groups.


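To treat this expression let us begin by considering a particular
initial state i, which may be defined by giving the numbers of particles
$n_k$, $n_l$, $n_m$, $n_n$ in each of the $g_\kappa$, $g_\lambda$, $g_\mu$, $g_\nu$ individual eigenstates k, l, m, n
which go to make up the groups of interest. Any final state f associated
with this initial state may then be specified by giving the particular
states k, l and m, n involved in the transition. Hence the summation
over f corresponds to a quadruple summation over k, l, m, n.
Introducing the expression for $|V_{fi}|^2$ in terms of $n_k$, $n_l$, $n_m$ and $n_n$ given by
(100.31), replacing $(E_f - E_i)$ by $(\epsilon_m + \epsilon_n - \epsilon_k - \epsilon_l)$, and re-expressing the
summation over f, we then obtain

$$P_f(t) = \frac{4}{G_i} \sum_i \sum_{k,l,m,n} |I_1 \pm I_2|^2\,n_k n_l\,(1 \pm n_m)(1 \pm n_n)\,
\frac{\sin^2 \frac{\pi}{h}(\epsilon_m + \epsilon_n - \epsilon_k - \epsilon_l)t}{(\epsilon_m + \epsilon_n - \epsilon_k - \epsilon_l)^2}, \tag{100.41}$$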





where the upper signs are for the Einstein-Bose case and the lower ones
for the Fermi-Dirac case.

To simplify this expression we must introduce further approximations.
We take the ‘direct and exchange integrals’ $I_1$ and $I_2$ as being
practically the same for any set of individual states picked from the
groups $\kappa$, $\lambda$, $\mu$, $\nu$; and we assign to $n_k$, $n_l$, $n_m$, $n_n$ for each individual
eigenstate the mean values which they have for their respective groups:

$$n_k = \frac{n_\kappa}{g_\kappa}, \qquad n_l = \frac{n_\lambda}{g_\lambda}, \qquad n_m = \frac{n_\mu}{g_\mu}, \qquad n_n = \frac{n_\nu}{g_\nu}. \tag{100.42}$$

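This then permits us to take the factors containing these quantities
outside the summation over k, l, m, and n, and to regard the summation
over all initial states i as cancelled by the factor $1/G_i$, which is the
reciprocal of the total number of initial states. We then obtain

$$P_f(t) = 4\,|I_1 \pm I_2|^2\,\frac{n_\kappa}{g_\kappa}\,\frac{n_\lambda}{g_\lambda}
\left(1 \pm \frac{n_\mu}{g_\mu}\right)\left(1 \pm \frac{n_\nu}{g_\nu}\right)
\sum_{k,l,m,n} \frac{\sin^2 \frac{\pi}{h}(\epsilon_m + \epsilon_n - \epsilon_k - \epsilon_l)t}{(\epsilon_m + \epsilon_n - \epsilon_k - \epsilon_l)^2}. \tag{100.43}$$

The summation still remaining in this expression may now be
evaluated by introducing an approximate integration in the familiar
manner already employed in the preceding section. Considering the
summation over n as to be taken first, we can introduce

$$dn = \frac{g_\nu}{\Delta\epsilon}\,d\epsilon_n \tag{100.44}$$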

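as an expression for the number of eigenstates of the kind n in the
range $d\epsilon_n$, since $g_\nu$ is by definition the number in the interval $\Delta\epsilon$
determined by our time of observation t. Replacing summation by
integration, we can then write for a given time t

$$\sum_{k,l,m,n} \frac{\sin^2 \frac{\pi}{h}(\epsilon_m + \epsilon_n - \epsilon_k - \epsilon_l)t}{(\epsilon_m + \epsilon_n - \epsilon_k - \epsilon_l)^2}
= \sum_{k,l,m} \frac{g_\nu}{\Delta\epsilon} \int_{\epsilon_\nu}^{\epsilon_\nu + \Delta\epsilon}
\frac{\sin^2 \frac{\pi}{h}(\epsilon_m + \epsilon_n - \epsilon_k - \epsilon_l)t}{(\epsilon_m + \epsilon_n - \epsilon_k - \epsilon_l)^2}\,d\epsilon_n. \tag{100.45}$$

In carrying out the integration indicated in this expression, $\epsilon_n$ is to
be treated as a variable and $\epsilon_m$, $\epsilon_k$, $\epsilon_l$, and t are to be treated as
parameters. On account of the presence of the resonance denominator
$(\epsilon_m + \epsilon_n - \epsilon_k - \epsilon_l)^2$ it will then be seen for sufficiently large values of t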





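that important contributions to the integral will be made only when
we have the approximate relation

$$\epsilon_m + \epsilon_n - \epsilon_k - \epsilon_l \approx 0. \tag{100.46}$$

This will have two consequences. In the first place it will restrict the
possibilities for appreciable transition to cases where the energies for
the groups of states $\kappa$, $\lambda$, $\mu$, $\nu$ satisfy within small limits the equation
of conservation

$$\epsilon_\mu + \epsilon_\nu = \epsilon_\kappa + \epsilon_\lambda. \tag{100.47}$$

And in the second place it will permit us to substitute for the integral
occurring in (100.45) the value $\pi^2 t/h$, which it would have if the limits
of integration for the variable $(\epsilon_m + \epsilon_n - \epsilon_k - \epsilon_l)t$ were taken as extending
all the way from minus to plus infinity. In place of (100.45) we can
then write

$$\sum_{k,l,m,n} \frac{\sin^2 \frac{\pi}{h}(\epsilon_m + \epsilon_n - \epsilon_k - \epsilon_l)t}{(\epsilon_m + \epsilon_n - \epsilon_k - \epsilon_l)^2}
= g_\kappa\,g_\lambda\,g_\mu\,\frac{g_\nu}{\Delta\epsilon}\,\frac{\pi^2}{h}\,t, \tag{100.48}$$

where the factors $g_\kappa$, $g_\lambda$, $g_\mu$ are introduced by the later summation over
those states.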

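Substituting (100.48) into (100.43), we then obtain for the desired
expression for the probability of finding a system in the ensemble at
time t in the final condition of interest

$$P_f(t) = \frac{4\pi^2}{h\,\Delta\epsilon}\,|I_1 \pm I_2|^2\,n_\kappa n_\lambda\,(g_\mu \pm n_\mu)(g_\nu \pm n_\nu)\,t. \tag{100.49}$$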


Thus in first approximation, for times t which are neither too long nor 
too short, we do have that the probability for finding a system in the 
final condition specified will grow proportionally to the time t. 
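The proportionality to t rests on the value of the resonance integral, which for infinite limits is $\pi^2 t/h$. The following minimal sketch checks this numerically with a plain midpoint rule. The function name `resonance_integral`, the choice h = 1, and the finite cut-off `half_width` are assumptions made only for illustration; the cut-off leaves a small residual of roughly `1/half_width`.

```python
import math


def resonance_integral(t, h=1.0, half_width=100.0, steps=40000):
    """Midpoint-rule estimate of the integral of sin^2(pi*x*t/h) / x^2
    over -half_width <= x <= half_width, where x stands for the energy
    defect e_m + e_n - e_k - e_l."""
    dx = 2.0 * half_width / steps
    a = math.pi * t / h
    total = 0.0
    for i in range(steps):
        x = -half_width + (i + 0.5) * dx
        if x == 0.0:
            total += a * a * dx          # limiting value of the integrand
        else:
            total += (math.sin(a * x) / x) ** 2 * dx
    return total


# Compare the numerical estimate with pi^2 * t / h for two times.
for t in (1.0, 2.0):
    print(t, resonance_integral(t), math.pi ** 2 * t)
```

Doubling t should very nearly double the estimate, which is the time-proportional behaviour stated above.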

Taking $Z_{\kappa\lambda,\mu\nu}$ as the average frequency per unit time which we can
expect for collisions in which a pair of particles are thrown from regions
$\kappa$, $\lambda$ to $\mu$, $\nu$, we can also write our result in the form

$$Z_{\kappa\lambda,\mu\nu} = A_{\kappa\lambda,\mu\nu}\,n_\kappa n_\lambda\,(g_\mu \pm n_\mu)(g_\nu \pm n_\nu), \tag{100.50}$$

where $A_{\kappa\lambda,\mu\nu}$ is introduced as an abbreviation for the factors containing
the integrals $I_1$ and $I_2$ whose values depend on the particular regions
$\kappa$, $\lambda$ and $\mu$, $\nu$ in the manner given by the equations of definition (100.14)
and (100.15). The upper and lower signs in this expression refer
respectively to the Einstein-Bose and Fermi-Dirac cases. In using this
expression, it should be noted, in accordance with (100.47), that it
applies to cases satisfying the relation

$$\epsilon_\mu + \epsilon_\nu = \epsilon_\kappa + \epsilon_\lambda, \tag{100.51}$$




and that probabilities of transition for cases where this is not satisfied 
can be taken as practically zero. 

Owing to the Hermitian character of the operators involved in
obtaining the integrals $I_1$ and $I_2$, it will also be seen that we can write
the very important relation

$$A_{\mu\nu,\kappa\lambda} = A_{\kappa\lambda,\mu\nu} \tag{100.52}$$

for the factor which would determine the frequency of collisions that
are the inverse of those which we have been considering; i.e. collisions
in which particles are thrown from $\mu$, $\nu$ to $\kappa$, $\lambda$ instead of from $\kappa$, $\lambda$
to $\mu$, $\nu$.

It may be felt that the lengthy investigation of collision probabilities 
which has been given in the three parts of the present section is hardly 
justified, in view of the many approximations which have been intro- 
duced. These include not only the usual kind of approximation em- 
ployed in connexion with the method of variation of constants but, in 
addition, further approximations necessitated by the special nature of
the application. Nevertheless, we shall be able to obtain a derivation
of the quantum mechanical H-theorem with the help of the above
results which will be the analogue of Boltzmann’s original derivation
of that theorem based on the classical behaviour of colliding molecules.
This will give us insight into the physical character of statistical
mechanics. Our final derivation of the H-theorem will be based on a
general treatment of changes in quantum mechanical ensembles with 
time made with the help of the exact integration of the Schroedinger 
equation. To this treatment we now turn. 

101. General treatment of changes in ensembles with time

We may now bring the chapter to a close by giving a general and
formally precise treatment to temporal changes in ensembles, which
would hold in any quantum mechanical language as well as in that
provided by unperturbed energy eigenstates, and which would be valid
over a long as well as over a short interval of time. We shall be specially
interested in the probability at time t = t of finding a member of the
ensemble in any particular state n, when the state of the ensemble
itself has been specified at some initial time t = 0. We can base the
discussion on the exact integration of the generalized Schroedinger
equation as given in § 96.

If we consider any single system, we have seen, (96.37), that the
probability amplitude $a_n(t)$ for any state n of the system at time t = t



§ 101 GENERAL TREATMENT FOR ENSEMBLES 45l 

can be calculated from the probability amplitudes $a_k(0)$ for the various
states k at time t = 0 by the expression

$$a_n(t) = \sum_k U_{nk}\,a_k(0), \tag{101.1}$$

where the $U_{nk}$ are matrix elements corresponding to the operator U(t)
and to the language provided by the states k. With the help of this
integration of the generalized Schroedinger equation we have then seen,
(96.40), that the probability for finding this system in state n at time t
would be given by

$$a_n^*(t)\,a_n(t) = \sum_k |U_{nk}|^2\,a_k^*(0)\,a_k(0) + \sum_{l \neq k} U_{nl}^*\,U_{nk}\,a_l^*(0)\,a_k(0), \tag{101.2}$$

where the first term is summed over all states k and the second term
is summed over all states l and all states k which are not the same.

Hence, if we now turn our interest from a single system to an
ensemble of systems, it is evident, from the significance of the density
matrix and from its original definition by

$$\rho_{nm} = \overline{\overline{a_m^*\,a_n}} \tag{101.3}$$

as a mean over all members of the ensemble, that we can write

$$\rho_{nn}(t) = \sum_k |U_{nk}|^2\,\rho_{kk}(0) + \sum_{l \neq k} U_{nl}^*\,U_{nk}\,\rho_{kl}(0) \tag{101.4}$$

as an expression for the probability $\rho_{nn}(t)$ of finding a member of the
ensemble in state n at time t, in terms of the initial probabilities $\rho_{kk}(0)$
for such states in the ensemble and in terms of the initial non-diagonal
elements for the density matrix $\rho_{kl}(0)$.

This expression assumes a specially simple form if we now regard the
ensemble as having been originally set up to represent the results of an
initial observation made on a system of interest at time t = 0. Taking
this observation as being a measurement of some quantity F of which
the states k are characteristic, we have then seen, especially from the
discussion in § 98 (d), that the fundamental postulate of statistical
mechanics will lead us to the initial assignment of equal probabilities
and random phases to those states k which agree equally well with the
observation. Hence, in accordance with the assignment of random
phases (see §§ 83 (a) and 84), the non-diagonal terms of the initial matrix
will be zero,

$$\rho_{kl}(0) = 0 \qquad (l \neq k), \tag{101.5}$$

and the expression for the probability of finding a member of the
ensemble in state n at time t will reduce to the simple form

$$\rho_{nn}(t) = \sum_k |U_{nk}|^2\,\rho_{kk}(0). \tag{101.6}$$





In making statistical mechanical applications of this result, our
ultimate interest will ordinarily lie in its relation to the probability $P_\nu(t)$
of finding a member of the ensemble in a group of $G_\nu$ states n between
which our approximate observations do not distinguish. We shall find
the result very important for our final general derivation of the
quantum mechanical H-theorem in the next chapter.

The validity of the special form of expression given by (101.6) is of
course dependent on the assumption of random phases for the
probability amplitudes $a_k(0)$ for the various states k at the initial time
t = 0. In this connexion it is of interest to examine what happens to
the original randomness of phase as time proceeds. For this purpose
we may use (101.1) to give

$$\rho_{nm}(t) = \overline{\overline{a_m^*(t)\,a_n(t)}} = \sum_{k,l} U_{ml}^*\,U_{nk}\,\rho_{kl}(0) \tag{101.7}$$

as a general expression for the density matrix at the later time t = t.
If the original phases of the states k, l were random, this would then
reduce to

$$\rho_{nm}(t) = \sum_k U_{mk}^*\,U_{nk}\,\rho_{kk}(0), \tag{101.8}$$

which would not in general reduce to zero for n not equal to m. Hence
we must conclude that the initial randomness of phase would in general
be lost as time proceeds. As we shall see in the next chapter, however,
this can be regarded as compensated by an increased randomness of
distribution over possible states.

We are now ready to apply the results of this chapter to the derivation
of the H-theorem.
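The loss of phase randomness expressed by (101.8) can be seen in the smallest possible case. The sketch below is only illustrative: the two-state system, the real rotation chosen for U, and the function name `evolve_density` are all assumptions, not taken from the text.

```python
import math


def evolve_density(u, rho0_diag):
    """Return rho(t) with elements rho_nm = sum_k conj(U_mk) U_nk rho_kk(0),
    i.e. the form (101.8) for an initially diagonal density matrix."""
    size = len(u)
    return [[sum(u[n][k] * u[m][k].conjugate() * rho0_diag[k]
                 for k in range(size))
             for m in range(size)]
            for n in range(size)]


theta = math.pi / 4
# A 2x2 unitary matrix (a real rotation), chosen only for illustration.
u = [[complex(math.cos(theta)), complex(-math.sin(theta))],
     [complex(math.sin(theta)), complex(math.cos(theta))]]

# Initially diagonal density matrix: all probability in the first state.
rho_t = evolve_density(u, [1.0, 0.0])

print(rho_t[0][0].real, rho_t[1][1].real)   # diagonal elements, as in (101.6)
print(abs(rho_t[0][1]))                     # non-diagonal element, not zero
```

The diagonal elements still sum to unity, while the non-diagonal element has become different from zero, in agreement with the conclusion drawn above.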



XII 

THE QUANTUM MECHANICAL H-THEOREM 

A. DERIVATION OF THEOREM 


102. Definition of H for a gas

We are now ready to proceed to a discussion of the quantum
mechanical H-theorem. For this purpose we shall first adopt a point
of view similar to the original one of Boltzmann, defining a quantity
H which directly characterizes the condition of a gas, and showing the
tendency for this quantity to decrease with time towards an equilibrium
value as a result of molecular collisions. We shall then turn to a more
general point of view similar to that of Gibbs, defining a quantity $\bar{\bar H}$
which would characterize the condition of the representative ensemble
for any system of interest, and showing the general tendency for this
quantity to decrease with time towards a steady value where the
ensemble would give a suitable representation for the condition of
equilibrium. The order of treatment will thus be the same as that of the
classical discussion in Chapter VI. We shall now regard it as desirable,
however, to give only a brief consideration to the effect of collisions in
changing the quantity H for a given sample of gas, and devote most
of our attention to the more general behaviour of the quantity $\bar{\bar H}$ for
a representative ensemble of systems and to the conclusions that can
be drawn therefrom.

We must begin by defining a suitable quantum mechanical quantity
H that would characterize the condition of a sample of gas. In
agreement with the classical expression (47.9), which relates the quantity H
to the number of states G (defined by equal volumes in the phase space)
that correspond to the specified condition of the gas, it also proves
profitable in the quantum statistics to introduce a similar definition

$$H = -\log G, \tag{102.1}$$

where G now denotes the number of quantum mechanical states of the 
gas that correspond to its specified condition. 

In order to give specific content to this expression, let us now
consider a sample of gas consisting of n similar particles enclosed in a
container of volume v. In accordance with our previous discussion in
§ 100 (c), the condition of such a gas could be appropriately specified
by regarding the eigenstates of energy for a single particle in the
container as divided up into groups of $g_\kappa$ neighbouring states, with a range



454 THE QUANTUM MECHANICAL H-THEOREM Chap. XII


in energy $\Delta\epsilon$ related to the time available for observation, and by then
stating the number of particles $n_\kappa$ assigned to each such group $\kappa$. For
the number of unperturbed energy eigenstates G for the gas as a whole,
that would correspond to such a specified condition, we could then
write, in agreement with (87.14) and (87.16), in the case of an
Einstein-Bose gas

$$G = \prod_\kappa \frac{(n_\kappa + g_\kappa - 1)!}{n_\kappa!\,(g_\kappa - 1)!}, \tag{102.2}$$

and in the case of a Fermi-Dirac gas

$$G = \prod_\kappa \frac{g_\kappa!}{n_\kappa!\,(g_\kappa - n_\kappa)!}, \tag{102.3}$$

where we take products $\prod$ over all groups of elementary eigenstates $\kappa$.
Taking the logarithms of these quantities, using the Stirling
approximation for factorial numbers as in § 88, and substituting in (102.1), we can
then write the desired explicit expressions for H in the combined form

$$H = \sum_\kappa \bigl\{ n_\kappa \log n_\kappa - (n_\kappa \pm g_\kappa)\log(g_\kappa \pm n_\kappa) \pm g_\kappa \log g_\kappa \bigr\}, \tag{102.4}$$

where the upper signs refer to the Einstein-Bose and the lower signs
to the Fermi-Dirac case.
These expressions agree with those originally introduced in this
connexion for the two cases respectively by Einstein† and by Fermi.‡ The
expressions are not precise for those groups of states $\kappa$ where the use
made of the Stirling approximation is unjustified. This is not important,
however, since we shall use the expressions to obtain an insight into
the special mechanism by which the decrease in H with time takes
place in gases as a result of molecular collisions, and shall give more
rigorous treatment to the generalized H-theorem later.
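The quality of the Stirling approximation behind (102.4) can be examined for a single group of states. The sketch below compares $-\log G$, computed from the factorials with the log-gamma function, against one term of (102.4). The function names and the sample values n = 500, g = 1000 are assumptions chosen only for illustration.

```python
import math


def log_g_exact(n, g, sign):
    """log of the number of states G for one group, from (102.2) (sign=+1)
    or (102.3) (sign=-1), using log-gamma for the factorials."""
    if sign == +1:   # (n + g - 1)! / (n! (g - 1)!)
        return math.lgamma(n + g) - math.lgamma(n + 1) - math.lgamma(g)
    # g! / (n! (g - n)!)
    return math.lgamma(g + 1) - math.lgamma(n + 1) - math.lgamma(g - n + 1)


def h_stirling(n, g, sign):
    """One term of (102.4): n log n - (n ± g) log(g ± n) ± g log g."""
    return (n * math.log(n)
            - (n + sign * g) * math.log(g + sign * n)
            + sign * g * math.log(g))


for sign, label in ((+1, "Einstein-Bose"), (-1, "Fermi-Dirac")):
    exact = -log_g_exact(500, 1000, sign)
    approx = h_stirling(500, 1000, sign)
    print(label, exact, approx)
```

For groups as large as this the two values differ by well under one per cent; for groups containing only a few particles or states the difference would be much larger, which is the lack of precision mentioned above.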

At the correspondence principle limit, where the conclusions drawn
from the quantum mechanics should approach the classical ones, the
above expressions come into agreement with Boltzmann’s expression
for H. The attainment of this limit will be favoured by high dilution,
large particle mass, and large energy for the gas as a whole; and these
factors will have the general effect of making the numbers of particles
in most of the groups of states $\kappa$ small compared with the numbers
of elementary states in the group. The expressions given by (102.4)
then approach‖

$$H = \sum_\kappa (n_\kappa \log n_\kappa - n_\kappa \log g_\kappa). \tag{102.5}$$

† Einstein, Berl. Ber. 1926, p. 3, equation (29 a).
‡ Fermi, Zeits. f. Phys. 36, 903 (1926), equation (10).

‖ This approximation is not close enough for a correct calculation of the
Sackur-Tetrode expression for the entropy of monatomic gases. See § 136 (b).



DEFINITION OF H 


455 


§ 102 

This agrees, in the manner to be expected, with the classical expression
(47.7) for Boltzmann’s H,

S = '^{ni log ni—Tii log (102.6) 

provided we express volumes in the $\mu$-space in terms of units of
magnitude $h^r$, where r is the number of degrees of freedom for the kind
of molecule involved.

103. Change of H with time as a result of collisions 
We must now consider the effect of collisions in leading to a decrease
with time in the value of the above defined expression for the quantity
H.† In accordance with our previous investigation of collisions in
§ 100, with the help of statistical methods we have seen in the case of
a sample of gas in a condition specified by taking $n_\kappa$, $n_\lambda$, $n_\mu$, $n_\nu$, etc.,
as the numbers of particles in different possible groups of $g_\kappa$, $g_\lambda$, $g_\mu$, $g_\nu$,
etc., elementary states, that we can take (100.50),

$$Z_{\kappa\lambda,\mu\nu} = A_{\kappa\lambda,\mu\nu}\,n_\kappa n_\lambda\,(g_\mu \pm n_\mu)(g_\nu \pm n_\nu), \tag{103.1}$$

as a reasonable expression for the expected number of collisions per unit
time in which a pair of particles would be thrown from groups $\kappa$, $\lambda$ to
$\mu$, $\nu$. The upper signs in this expression refer to the Einstein-Bose and
the lower to the Fermi-Dirac case. The quantity $A_{\kappa\lambda,\mu\nu}$ is a coefficient
satisfying the equality

$$A_{\mu\nu,\kappa\lambda} = A_{\kappa\lambda,\mu\nu} \tag{103.2}$$

on account of its relation to the elements of an Hermitian perturbation
matrix. And the value of this coefficient is practically zero except for
collisions which satisfy in close approximation the energy relation

$$\epsilon_\mu + \epsilon_\nu = \epsilon_\kappa + \epsilon_\lambda. \tag{103.3}$$

In agreement with (103.1), we may now write, as an appropriate
expression for the rate of change in the number of particles in group $\kappa$,

$$\frac{dn_\kappa}{dt} = -\sum_{\lambda,(\mu\nu)} A_{\kappa\lambda,\mu\nu}\,n_\kappa n_\lambda\,(g_\mu \pm n_\mu)(g_\nu \pm n_\nu)
+ \sum_{\lambda,(\mu\nu)} A_{\mu\nu,\kappa\lambda}\,n_\mu n_\nu\,(g_\kappa \pm n_\kappa)(g_\lambda \pm n_\lambda), \tag{103.4}$$

where we sum over all groups $\lambda$ and over all pairs of groups $(\mu\nu)$, and
make a double inclusion of those terms in the summation for which
$\lambda = \kappa$.

With the help of this expression we can now investigate the rate 
of change with time in the quantity H, which itself depends on the 

† The treatment follows that of Pauli, Sommerfeld Festschrift, Leipzig, 1928, p. 30.




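numbers of particles $n_\kappa$ in the different groups. Starting with the
explicit expression (102.4) which we have given,

$$H = \sum_\kappa \bigl\{ n_\kappa \log n_\kappa - (n_\kappa \pm g_\kappa)\log(g_\kappa \pm n_\kappa) \pm g_\kappa \log g_\kappa \bigr\}, \tag{103.5}$$

and differentiating with respect to the time, we obtain

$$\frac{dH}{dt} = \sum_\kappa \log\frac{n_\kappa}{g_\kappa \pm n_\kappa}\,\frac{dn_\kappa}{dt}. \tag{103.6}$$

Substituting (103.4), this then gives us

$$\frac{dH}{dt} = -\sum_{\kappa,\lambda,(\mu\nu)} A_{\kappa\lambda,\mu\nu}\,n_\kappa n_\lambda\,(g_\mu \pm n_\mu)(g_\nu \pm n_\nu)\,\log\frac{n_\kappa}{g_\kappa \pm n_\kappa}
+ \sum_{\kappa,\lambda,(\mu\nu)} A_{\mu\nu,\kappa\lambda}\,n_\mu n_\nu\,(g_\kappa \pm n_\kappa)(g_\lambda \pm n_\lambda)\,\log\frac{n_\kappa}{g_\kappa \pm n_\kappa}, \tag{103.7}$$

where the summations are over all groups $\kappa$ and $\lambda$, and over all pairs
of groups $(\mu\nu)$. Changing to a summation over all pairs of groups $(\kappa\lambda)$,
this then becomes

$$\frac{dH}{dt} = -\sum_{(\kappa\lambda),(\mu\nu)} A_{\kappa\lambda,\mu\nu}\,n_\kappa n_\lambda\,(g_\mu \pm n_\mu)(g_\nu \pm n_\nu)
\left(\log\frac{n_\kappa}{g_\kappa \pm n_\kappa} + \log\frac{n_\lambda}{g_\lambda \pm n_\lambda}\right)
+ \sum_{(\kappa\lambda),(\mu\nu)} A_{\mu\nu,\kappa\lambda}\,n_\mu n_\nu\,(g_\kappa \pm n_\kappa)(g_\lambda \pm n_\lambda)
\left(\log\frac{n_\kappa}{g_\kappa \pm n_\kappa} + \log\frac{n_\lambda}{g_\lambda \pm n_\lambda}\right). \tag{103.8}$$

And taking the arithmetical mean of this expression with the equivalent
result obtained by interchanging the pair of indices $(\kappa\lambda)$ with the pair
$(\mu\nu)$, we finally obtain

$$\frac{dH}{dt} = -\frac{1}{2}\sum_{(\kappa\lambda),(\mu\nu)} A_{\kappa\lambda,\mu\nu}\,n_\kappa n_\lambda\,(g_\mu \pm n_\mu)(g_\nu \pm n_\nu)
\left(\log\frac{n_\kappa}{g_\kappa \pm n_\kappa} + \log\frac{n_\lambda}{g_\lambda \pm n_\lambda} - \log\frac{n_\mu}{g_\mu \pm n_\mu} - \log\frac{n_\nu}{g_\nu \pm n_\nu}\right)$$
$$\qquad -\frac{1}{2}\sum_{(\kappa\lambda),(\mu\nu)} A_{\mu\nu,\kappa\lambda}\,n_\mu n_\nu\,(g_\kappa \pm n_\kappa)(g_\lambda \pm n_\lambda)
\left(\log\frac{n_\mu}{g_\mu \pm n_\mu} + \log\frac{n_\nu}{g_\nu \pm n_\nu} - \log\frac{n_\kappa}{g_\kappa \pm n_\kappa} - \log\frac{n_\lambda}{g_\lambda \pm n_\lambda}\right). \tag{103.9}$$

To proceed farther, as an essential step we must now introduce the
equality between the coefficients for inverse collisions

$$A_{\mu\nu,\kappa\lambda} = A_{\kappa\lambda,\mu\nu}, \tag{103.10}$$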



§ 103 


CHANGE OF H BY COLLISIONS 


457 


which has already been mentioned above. Substituting in (103.9), we
then obtain, after some reorganization, the desired form of expression

$$\frac{dH}{dt} = -\frac{1}{2}\sum_{(\kappa\lambda),(\mu\nu)} A_{\kappa\lambda,\mu\nu}\,
\bigl[n_\kappa n_\lambda (g_\mu \pm n_\mu)(g_\nu \pm n_\nu) - n_\mu n_\nu (g_\kappa \pm n_\kappa)(g_\lambda \pm n_\lambda)\bigr]
\times \log\frac{n_\kappa n_\lambda (g_\mu \pm n_\mu)(g_\nu \pm n_\nu)}{n_\mu n_\nu (g_\kappa \pm n_\kappa)(g_\lambda \pm n_\lambda)}. \tag{103.11}$$

This result, however, has a value which can only be zero or negative.
To see this, we note for any term in the indicated summation that the
factor $A_{\kappa\lambda,\mu\nu}$ could not be negative on account of its physical
significance, and that expressions of the form $(g_\kappa \pm n_\kappa)$ could
also not be negative, since even in the Fermi-Dirac case, where the
negative sign appears, we should have to have $g_\kappa \geq n_\kappa$ on account of
the Pauli exclusion principle. Thus the various terms in the summation
will all be of the form

$$A\,(x - y)\log\frac{x}{y}, \tag{103.12}$$

with none of the quantities A, x, y negative. As a consequence, these
terms will themselves be quantities which can only assume zero or
positive values. Hence, noting the negative sign for the whole
summation in (103.11), we see that this expression for the expected rate of
change in H must satisfy the relation

$$\frac{dH}{dt} \leq 0. \tag{103.13}$$

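Furthermore, as the necessary and sufficient condition for a null value
of this rate of change, it is readily seen that it would be necessary to
have

$$\log\frac{n_\kappa}{g_\kappa \pm n_\kappa} + \log\frac{n_\lambda}{g_\lambda \pm n_\lambda}
= \log\frac{n_\mu}{g_\mu \pm n_\mu} + \log\frac{n_\nu}{g_\nu \pm n_\nu} \tag{103.14}$$

for all collisions for which $A_{\kappa\lambda,\mu\nu}$ has an appreciable value, and hence,
in general, in accordance with the discussion of § 100 (c), for all collisions
approximately satisfying the energy relation

$$\epsilon_\kappa + \epsilon_\lambda = \epsilon_\mu + \epsilon_\nu \tag{103.15}$$

already mentioned above. As the simultaneous solution of these two
equations, we then obtain an expression of the form

$$\log\frac{n_\kappa}{g_\kappa \pm n_\kappa} + \alpha + \beta\epsilon_\kappa = 0, \tag{103.16}$$

where $\alpha$ and $\beta$ are constants independent of $\kappa$. Or, solving for $n_\kappa$, this
can also be written in the more familiar form

$$n_\kappa = \frac{g_\kappa}{e^{\alpha + \beta\epsilon_\kappa} \mp 1}, \tag{103.17}$$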



in agreement with our previous expressions (89.6) and (89.7) for the
distributions prevailing at equilibrium in the case of Einstein-Bose and
Fermi-Dirac gases.
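That the distribution (103.17) does satisfy the condition (103.14) for any collision conserving energy can be checked directly. The following is a minimal sketch; the function names and the numerical values of $\alpha$, $\beta$, g and the four energies are assumptions chosen only for illustration.

```python
import math


def equilibrium_occupation(g, eps, alpha, beta, sign):
    """n_kappa of (103.17): g / (exp(alpha + beta*eps) -/+ 1);
    sign=+1 Einstein-Bose, sign=-1 Fermi-Dirac."""
    return g / (math.exp(alpha + beta * eps) - sign)


def log_ratio(n, g, sign):
    """log(n / (g ± n)), the quantity entering (103.14) and (103.16)."""
    return math.log(n / (g + sign * n))


# Illustrative values only; alpha, beta, g and the energies are assumptions.
alpha, beta, g = 0.5, 1.2, 10.0
e_k, e_l, e_m, e_n = 0.3, 1.7, 0.8, 1.2      # e_k + e_l equals e_m + e_n
for sign in (+1, -1):
    occ = [equilibrium_occupation(g, e, alpha, beta, sign)
           for e in (e_k, e_l, e_m, e_n)]
    lhs = log_ratio(occ[0], g, sign) + log_ratio(occ[1], g, sign)
    rhs = log_ratio(occ[2], g, sign) + log_ratio(occ[3], g, sign)
    print(sign, lhs, rhs)
```

In both cases the two sides agree to rounding error, since each logarithm reduces to $-(\alpha + \beta\epsilon)$ and the energies on the two sides have the same sum.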

We thus indeed find, also in the quantum mechanics, that we can 
define a suitable quantity H to characterize the condition of a gas 
which will have the desired property of exhibiting a tendency to 
decrease with time as a result of collisions, unless the observed distribu- 
tion of the molecules is already such as to satisfy the accepted expres- 
sion for equilibrium. No elaborate discussion of this finding will be 
necessary, since the detailed discussion of the classical H-theorem, given
in the various parts of § 49, can be readily adapted to the present
quantum mechanical analogue. The main points of that discussion may
again be mentioned, and it will be seen on reflection that they would
apply also in the quantum mechanical case. 
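The tendency of H to decrease can also be followed numerically by integrating the rate equations (103.4) for a small model. The sketch below makes several assumptions that are not in the text: four groups with integer energies, equal weights, a single constant coefficient `a` for every energy-conserving collision (summed over ordered sets of indices, so that counting factors are absorbed in `a`), and a simple forward-Euler step.

```python
import math
from itertools import product


def h_quantity(n, g, sign):
    """H of (102.4); sign=+1 Einstein-Bose, sign=-1 Fermi-Dirac."""
    total = 0.0
    for nk, gk in zip(n, g):
        if nk > 0.0:
            total += nk * math.log(nk)
        total += -(nk + sign * gk) * math.log(gk + sign * nk)
        total += sign * gk * math.log(gk)
    return total


def rates(n, g, eps, sign, a):
    """dn/dt in the manner of (103.4), with one constant coefficient `a`
    for every energy-conserving collision (an assumption of this sketch)."""
    dn = [0.0] * len(n)
    for k, l, m, v in product(range(len(n)), repeat=4):
        if eps[k] + eps[l] != eps[m] + eps[v]:
            continue  # coefficient taken as zero unless energy is conserved
        z = a * n[k] * n[l] * (g[m] + sign * n[m]) * (g[v] + sign * n[v])
        dn[k] -= z
        dn[l] -= z
        dn[m] += z
        dn[v] += z
    return dn


def relax(n0, g, eps, sign, a=1e-4, dt=1e-3, steps=1000):
    """Forward-Euler integration; returns final occupations and H history."""
    n = list(n0)
    history = [h_quantity(n, g, sign)]
    for _ in range(steps):
        dn = rates(n, g, eps, sign, a)
        n = [x + dt * d for x, d in zip(n, dn)]
        history.append(h_quantity(n, g, sign))
    return n, history


g = [10.0, 10.0, 10.0, 10.0]
eps = [0, 1, 2, 3]               # integer energies so conservation is exact
n0 = [1.0, 6.0, 0.5, 4.0]        # hypothetical non-equilibrium occupations
for sign in (+1, -1):
    n_final, hist = relax(n0, g, eps, sign)
    print(sign, hist[0], hist[-1])
```

Since the coefficient is the same for direct and inverse collisions, each pair of inverse processes contributes a term of the form (103.12), and the computed H should fall step by step while the total number of particles and the total energy remain fixed.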

The H-theorem is a principle, having a statistical rather than an
exact character, which makes it a probable rather than a certain
prediction that H will decrease with time as a result of collisions when
a gas is observed to be in a condition differing from equilibrium. The
theorem is such as to make it plausible to assume that continued
observations would show a continued decrease in H with time towards
a minimum equilibrium value. The theorem relates to the changes that
may be expected in the condition rather than in the precise state of
a gas; and the conclusion of statistical mechanics, as to a preferred
direction of change in conditions of the system, stands in no conflict
with the conclusion of exact mechanics, as to dynamical reversibility,
which also in the quantum mechanics (§ 95) makes it possible for a
system to pass in either direction through a series of precise states. As
a consequence of the theorem, we can expect the final behaviour of an
isolated gas to consist in a succession of fluctuations in the value of H
around its minimum, with large fluctuations therefrom occurring very
infrequently.

In concluding this discussion it is of interest to note that the
foregoing quantum mechanical derivation of the H-theorem is in one
respect made very simple by the possibility of introducing a direct 
equality (103.10) between the probability coefficients for inverse colli- 
sions. This equality arises, as we have seen in § 100 (c), on the one hand 
from the statistical assumptions involved in our treatment of the 
groups of states between which transitions are taken as occurring, and 
on the other hand from the Hermitian character of the perturbation 
operator corresponding to the energy of interaction between the particles. 





Analogous simple methods of treatment would be possible also in the
case of molecules more complicated than particles.

In the corresponding classical treatment of the H-theorem as given
in § 48 (b), it was not possible to proceed in general in such a simple
manner, since the more complete specification which is given to
molecular states in the classical mechanics rules out the existence of inverse
collisions except in the special case of spherical molecules. This made
it necessary to base the general classical derivation on a somewhat
complicated consideration of Boltzmann’s closed cycles of
corresponding collisions. We shall return to a discussion of questions related to
this difference in § 116.

104. Definition of $\bar{\bar H}$ for a representative ensemble of systems
(a) Fine-grained and coarse-grained probabilities in the quantum
mechanics. We are now ready to turn to more powerful methods of
treating the approach of quantum mechanical systems towards
equilibrium which are analogous to the methods introduced into the classical
statistics for this purpose by Gibbs. We may begin by defining a suitable
quantity $\bar{\bar H}$ which would characterize the condition of the representative
ensemble for any system of interest.

In the quantum mechanics, as in the classical mechanics, the
definition of the quantity $\bar{\bar H}$ for a representative ensemble depends on a
distinction between what may be called the fine-grained and
coarse-grained probabilities for the different possible states of members of the
ensemble. The necessity for such a distinction depends, as before, on
the circumstance that our actual interest in statistical mechanics lies
ordinarily in conditions of a system which are not sufficiently well
known for a precise specification of state. We must first discuss the
character of the two kinds of probabilities.

Let us consider an ensemble in a state which is described by the
elements of the density matrix

$$\rho_{nm} = \overline{\overline{a_m^*\,a_n}}, \tag{104.1}$$

where the quantities $a_m$ and $a_n$ denote probability amplitudes for the
various states of interest m, n, etc., for a single system, and we take
a mean of the indicated product over all members of the ensemble as
indicated by the double bar. As the probability of finding a member
of the ensemble in a particular state n, we can then write

$$\rho_{nn} = \overline{\overline{a_n^*\,a_n}}. \tag{104.2}$$

This may be called the fine-grained probability for the state n since it



460 


THE QUANTUM MECHANICAL H-THEOREM Chap. XII 


gives the probability for finding that state if precise observations are 
made on the different states n of the members of the ensemble. 

In the situations of interest for statistical mechanics, however, we 
regard ourselves as making observations of only limited accuracy, which 
are in general insufficient to distinguish between neighbouring states n 
having similar properties. Hence, if we consider a group of $G_\nu$ states
n between which our measurements do not distinguish, we shall also
be interested in the total probability $P_\nu$ of finding a member of the
ensemble in this group. For this we can evidently write

$$P_\nu = \sum_{n=1}^{G_\nu} \rho_{nn}, \tag{104.3}$$

where we take a sum over all states n in the group. In terms of this
expression, for the probability of the whole group of $G_\nu$ states, we may
then write

$$P_n = \frac{P_\nu}{G_\nu} \tag{104.4}$$

as an equation defining the coarse-grained probability for the states n in
the group of $G_\nu$ states.

As in the classical mechanics, we thus define the coarse-grained
probability for a state as the mean of fine-grained probabilities taken over
neighbouring states of nearly identical properties. And it will be seen
that our present fine-grained and coarse-grained probabilities, $\rho_{nn}$ and
$P_n$ for a quantum mechanical state n, are indeed the natural analogues
for our previous probabilities, $\rho\,dq_1 \dots dp_f$ and $P\,dq_1 \dots dp_f$ for a classical
state defined by the infinitesimal region $dq_1 \dots dp_f$ in the phase space.
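The coarse-graining rule of (104.3) and (104.4) can be illustrated with a short numerical sketch. The five fine-grained probabilities and the division into two groups below are hypothetical values chosen only for illustration, and the function name `coarse_grain` is likewise an assumption of this sketch rather than anything in the text.

```python
def coarse_grain(rho_diag, groups):
    """Return coarse-grained probabilities P_n from fine-grained rho_nn.

    rho_diag: fine-grained probabilities rho_nn, one per state n.
    groups:   list of lists of state indices; each inner list is one group
              of G_nu states that the observations cannot tell apart.
    """
    P = [0.0] * len(rho_diag)
    for group in groups:
        P_nu = sum(rho_diag[n] for n in group)  # total group probability, (104.3)
        for n in group:
            P[n] = P_nu / len(group)            # mean over the group, (104.4)
    return P


# Hypothetical example: five states in groups of G = 2 and G = 3.
rho_diag = [0.05, 0.25, 0.10, 0.20, 0.40]
groups = [[0, 1], [2, 3, 4]]
P = coarse_grain(rho_diag, groups)
```

Every state in a group receives the same coarse-grained probability, and the total probability is unchanged by the averaging, so both sets of probabilities remain normalized to unity.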

The quantum mechanical language, to be used in setting up our
coarse-grained probabilities $P_n$, and the range and character of states n,
to be used in taking the mean, will, of course, be determined by the
nature of the observational processes that we have in view. Both the
fine-grained and the coarse-grained probabilities will be regarded as
normalized to unity with

$$\sum_k \rho_{kk} = 1 \tag{104.5}$$

and

$$\sum_k P_k = 1, \tag{104.6}$$

where we take summations over all possible states k.

(b) General expression for $\bar{\bar{H}}$. In terms of such coarse-grained
probabilities we may now give a general definition of the quantity $\bar{\bar{H}}$ for
an ensemble, by the summation

$$\bar{\bar{H}} = \sum_k P_k \log P_k \tag{104.7}$$

taken over all possible states k. As the work of the present chapter
proceeds, we shall find that this definition does give us a quantity which
can be regarded as measuring deviation from equilibrium and which
tends to decrease with time to an equilibrium value.

The defining expression given by (104.7) can also be written in other 
forms that prove useful. 

Since $\log P_k$ will evidently have the same value for all states k that
lie in each particular group of states for which $P_k$ is the mean of
$\rho_{kk}$, and since the summation of $P_k$ or $\rho_{kk}$ over such a group would give
the same result, we can also write (104.7) in the form

$$\bar{\bar{H}} = \sum_k \rho_{kk} \log P_k = \overline{\overline{\log P_k}}. \tag{104.8}$$

We thus see that $\bar{\bar{H}}$ can be regarded as the mean value in the ensemble
of $\log P_k$ for all states k. The symbolism that we have adopted thus
agrees with our general convention of using a double bar to indicate
a mean value for an ensemble as a whole.

Noting the relation (104.4), between the total probability for a 
group of states and the mean probability for the members of the 
group, it will be seen that we can also rewrite our defining expression 
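(104.7) in the form

$$\bar{\bar{H}} = \sum_\kappa \sum_{k=1}^{G_\kappa} \frac{P_\kappa}{G_\kappa} \log \frac{P_\kappa}{G_\kappa} = \sum_\kappa P_\kappa \log \frac{P_\kappa}{G_\kappa}, \tag{104.9}$$

where the final result arises from the circumstance that the summation
over k = 1 to $G_\kappa$ would consist in the addition of $G_\kappa$ equal terms.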

In closing these general remarks on the definition of $\bar{\bar{H}}$, we may
emphasize the distinction between the two different quantities

$$\sum_k \rho_{kk} \log \rho_{kk} \quad \text{and} \quad \sum_k P_k \log P_k, \tag{104.10}$$

the latter being the one by which $\bar{\bar{H}}$ is actually defined.† Only in special
cases will a representative ensemble be in such a state that the two
quantities are equal (e.g. an ensemble representing a measurement
which has just been made, see § 106 (a), equation (106.4)).
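As a check on the equivalence of the forms (104.7), (104.8) and (104.9), and on the distinction drawn in (104.10), the following sketch evaluates all four sums for one hypothetical distribution. The numerical values and the variable names are assumptions made only for this illustration.

```python
import math

# Hypothetical fine-grained probabilities rho_kk and grouping into conditions kappa.
rho = [0.05, 0.25, 0.10, 0.20, 0.40]
groups = [[0, 1], [2, 3, 4]]

# Total group probabilities P_kappa and coarse-grained probabilities P_k.
P_kappa = [sum(rho[k] for k in g) for g in groups]
P = [0.0] * len(rho)
for total, g in zip(P_kappa, groups):
    for k in g:
        P[k] = total / len(g)

# (104.7): sum over states of P_k log P_k.
H_7 = sum(p * math.log(p) for p in P if p > 0.0)
# (104.8): ensemble mean of log P_k, weighted by fine-grained probabilities.
H_8 = sum(r * math.log(p) for r, p in zip(rho, P) if r > 0.0)
# (104.9): sum over groups of P_kappa log(P_kappa / G_kappa).
H_9 = sum(t * math.log(t / len(g)) for t, g in zip(P_kappa, groups) if t > 0.0)

# First quantity of (104.10), built from fine-grained probabilities only.
fine_grained_sum = sum(r * math.log(r) for r in rho if r > 0.0)
```

The three values `H_7`, `H_8` and `H_9` agree to rounding error, while `fine_grained_sum` is larger unless the fine-grained probabilities are already uniform within every group.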


† The definition of $\bar{\bar{H}}$ given above is in agreement with that of Pauli, Sommerfeld
Festschrift, Leipzig, 1928, p. 30. We have, however, perhaps laid more emphasis on the
circumstances that the quantity is defined in terms of coarse-grained probabilities, and
is primarily characteristic of the condition of a representative ensemble for a system
of interest rather than immediately characteristic of the system itself. The definition
does not agree, on the other hand, with that of Klein, Zeits. f. Phys. 72, 767 (1931),
who uses the first of the two quantities (104.10) and thus fails to give clearly appropriate 
recognition to the main function of statistical mechanics in providing a treatment for 
the behaviour of systems in conditions that are not sufficiently defined for a precise 
specification of state. See § 106 (c). 



It is also of interest to note that our present quantum mechanical
definition of the quantity $\bar{\bar{H}}$ by $\sum_k P_k \log P_k$ is indeed the natural
analogue of our previous classical definition by
$\int \dots \int P \log P \, dq_1 \dots dp_f$.

(c) Relation between $\bar{\bar{H}}$ and H. It will now be informative to examine
the relation between our present quantity $\bar{\bar{H}}$, which characterizes the
condition of the representative ensemble for a system of interest, and
our previous quantity H, which more directly characterizes the
condition of a single system of the kind under discussion. In accordance
with (104.9), we can write the definition of $\bar{\bar{H}}$ in the form

$$\bar{\bar{H}} = \sum_\kappa P_\kappa \log P_\kappa - \sum_\kappa P_\kappa \log G_\kappa, \tag{104.11}$$

where $P_\kappa$ is the probability of finding a member of the ensemble in an
observed condition $\kappa$, and $G_\kappa$ is the number of quantum mechanical
states for a single system which would correspond to this condition.
On the other hand, in accordance with (102.1), we can write the
definition of H in the form

$$H_\kappa = -\log G_\kappa, \tag{104.12}$$

where $G_\kappa$ is the number of quantum mechanical states that correspond
to a condition $\kappa$ actually specified for a single system. Combining the
two expressions, we may then write

$$\bar{\bar{H}} = \sum_\kappa P_\kappa H_\kappa + \sum_\kappa P_\kappa \log P_\kappa = \bar{\bar{H}}_{\text{syst}} + \sum_\kappa P_\kappa \log P_\kappa, \tag{104.13}$$

where we use the symbol $\bar{\bar{H}}_{\text{syst}}$ to denote the mean value, for all
members of the ensemble, of the quantity H applying in the original form
of H-theorem to a single system. For the special case of a system which
has been observed to be in a given condition, the above relation would
reduce at the instant of observation to

$$\bar{\bar{H}} = H_\kappa, \tag{104.14}$$

since $P_\kappa$ would then be unity for the particular condition $\kappa$ that had
been found and otherwise zero.

Equations (104.13) and (104.14) are the quantum mechanical
analogues of our previous classical equations (51.33) and (51.34). They will
permit similar views in the quantum as in the classical mechanics,
concerning the relation between the original form of the H-theorem as
already discussed in § 103, and the generalized form of that theorem
to the discussion of which we now turn.



105. Change of $\bar{\bar{H}}$ with time by the method of transition
probabilities

We are now ready to consider a derivation of the generalized
H-theorem, in which we shall use our previous results as to transition
probabilities to demonstrate the tendency for $\bar{\bar{H}}$ to decrease with time
in the case of an ensemble which has been set up to represent some
system of interest. For this purpose we shall assume that we are going
to be concerned with approximate observations on the system of
interest which can be correlated with groups of unperturbed energy
eigenstates k which correspond to some unperturbed Hamiltonian $\mathbf{H}^{0}$
for the system. In accordance with the character of our interest, and
in agreement with (104.9), we can then take

$$\bar{\bar{H}} = \sum_\kappa P_\kappa \log \frac{P_\kappa}{G_\kappa} \tag{105.1}$$

as an appropriate expression for the value of $\bar{\bar{H}}$ in our representative
ensemble, where $P_\kappa$ is the probability at some time of interest of finding
a member of the ensemble in a specified condition $\kappa$ corresponding to
such a group of unperturbed states. Differentiating (105.1) with
respect to the time t, we then obtain as a general expression for the
rate of change in $\bar{\bar{H}}$ with time

$$\frac{d\bar{\bar{H}}}{dt} = \sum_\kappa \frac{dP_\kappa}{dt} \left( \log P_\kappa - \log G_\kappa + 1 \right) = \sum_\kappa \frac{dP_\kappa}{dt} \log \frac{P_\kappa}{G_\kappa}, \tag{105.2}$$

where the second form of writing is justified by the consideration that
the total probability of finding a member of the ensemble in one or
another condition $\kappa$ cannot change with time.

To make a specific application of this general expression, let us now
consider that we are interested in determining the value of $d\bar{\bar{H}}/dt$ at
an initial time when the ensemble is set up to represent an initial
observation on the condition of the system of interest. In accordance
with our fundamental hypothesis as to a priori probabilities and phases,
the initial state of the ensemble will then be specified in such a manner
as to give probabilities for the different states of its members that agree
with the information made available by the observation, and also in
particular to give random phases for all the states that are represented.
Thus the ensemble will be in a state to satisfy the conditions necessary
for applying the general formulation given to transition probabilities in
§ 99 (d), and we can take our previous expressions (99.42) and (99.43),

$$Z_{\kappa\nu} = A_{\kappa\nu} G_\nu P_\kappa \tag{105.3}$$

and

$$Z_{\nu\kappa} = A_{\nu\kappa} G_\kappa P_\nu, \tag{105.4}$$

as giving those contributions to the total rates of change in $P_\kappa$ and
$P_\nu$ which are to be ascribed respectively to transitions in the ensemble
from condition $\kappa$ to $\nu$ and from condition $\nu$ to $\kappa$.

This will then let us write

$$\frac{dP_\kappa}{dt} = \sum_\nu \left( A_{\nu\kappa} G_\kappa P_\nu - A_{\kappa\nu} G_\nu P_\kappa \right), \tag{105.5}$$

where we take a summation over all conditions $\nu$, as an expression for
the rate of change in $P_\kappa$ with time. Substituting in (105.2), this now
gives us

$$\frac{d\bar{\bar{H}}}{dt} = \sum_{\kappa,\nu} \left( A_{\nu\kappa} G_\kappa P_\nu - A_{\kappa\nu} G_\nu P_\kappa \right) \log \frac{P_\kappa}{G_\kappa} \tag{105.6}$$

for the rate of change in $\bar{\bar{H}}$. Furthermore, by taking the arithmetical
mean of this expression with the equivalent result obtained by
interchanging the indices $\kappa$ and $\nu$, we can write

$$\frac{d\bar{\bar{H}}}{dt} = \frac{1}{2} \sum_{\kappa,\nu} \left( A_{\nu\kappa} G_\kappa P_\nu - A_{\kappa\nu} G_\nu P_\kappa \right) \left( \log \frac{P_\kappa}{G_\kappa} - \log \frac{P_\nu}{G_\nu} \right). \tag{105.7}$$

To proceed farther, as an essential step we must now introduce the
equality between the coefficients for inverse transitions

$$A_{\kappa\nu} = A_{\nu\kappa}, \tag{105.8}$$

which we have found in § 99 as a consequence of the Hermitian
character of perturbation operators and of our statistical assumptions.
Substituting above, we can then rewrite (105.7) in the form

$$\frac{d\bar{\bar{H}}}{dt} = -\frac{1}{2} \sum_{\kappa,\nu} A_{\kappa\nu} G_\kappa G_\nu \left( \frac{P_\kappa}{G_\kappa} - \frac{P_\nu}{G_\nu} \right) \left( \log \frac{P_\kappa}{G_\kappa} - \log \frac{P_\nu}{G_\nu} \right). \tag{105.9}$$

Since the factor $A_{\kappa\nu} G_\kappa G_\nu$ cannot be negative merely on grounds of
physical significance, and since the remaining combination of factors
cannot be negative on the combined grounds of significance and form
(see remarks made in connexion with (48.13)), we are then led to the
expected result

$$\frac{d\bar{\bar{H}}}{dt} \le 0 \tag{105.10}$$

for the rate of change in $\bar{\bar{H}}$ with time.



Furthermore, as the necessary and sufficient requirement for a null
value of this rate of change, it will be seen that it would be necessary
to have

$$\frac{P_\kappa}{G_\kappa} = \frac{P_\nu}{G_\nu} = \text{const.} \tag{105.11}$$

for all conditions $\kappa$, $\nu$, etc., that are represented in the ensemble or that
could be reached from those present. Or, making use of the definition
of coarse-grained probability in terms of the total probability $P_\nu$ for
a group of $G_\nu$ states, as given by (104.4), we can also take the
uniformity of coarse-grained probability

$$P_n = \text{const.} \tag{105.12}$$

for the unperturbed eigenstates n corresponding to such conditions as
the requirement for equilibrium. In interpreting this result it is to be
remembered, in accordance with the method of transition probabilities,
that the different possible states n are all regarded as having
unperturbed energies that lie within a given range E to $E + \Delta E$ having
a width determined by the character of our approximate observations.
It will be noted that the above requirement for a stationary value of
$\bar{\bar{H}}$ is also the requirement for a minimum value of that quantity.

We thus find that we can predict a decreasing value of $\bar{\bar{H}}$ in the case
of an ensemble set up to represent an initial approximate observation
on the unperturbed energy eigenstates of a system of interest. We
also note that the requirement for a minimum equilibrium value of
$\bar{\bar{H}}$, as given above by $P_n = \text{const.}$, would be satisfied with our previous
choice of the microcanonical ensemble to represent a system in a steady
condition of equilibrium.
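The behaviour described by (105.5), (105.10) and (105.11) can be made concrete by integrating the rate equations numerically. The sketch below is only an illustration under stated assumptions: the three group sizes $G_\kappa$, the symmetric coefficients $A_{\kappa\nu}$, the time step and the forward-Euler scheme are all hypothetical choices, not taken from the text.

```python
import math


def h_bar(P, G):
    """H-bar = sum over kappa of P_kappa log(P_kappa / G_kappa), as in (105.1)."""
    return sum(p * math.log(p / g) for p, g in zip(P, G) if p > 0.0)


def euler_step(P, G, A, dt):
    """Advance the group probabilities by one forward-Euler step of (105.5)."""
    n = len(P)
    new_P = []
    for k in range(n):
        rate = sum(A[v][k] * G[k] * P[v] - A[k][v] * G[v] * P[k] for v in range(n))
        new_P.append(P[k] + dt * rate)
    return new_P


# Hypothetical data: three conditions with G = 1, 3, 6 states and
# symmetric transition coefficients A[k][v] = A[v][k], as required by (105.8).
G = [1, 3, 6]
A = [[0.0, 0.2, 0.1],
     [0.2, 0.0, 0.3],
     [0.1, 0.3, 0.0]]

P = [1.0, 0.0, 0.0]          # system observed in the first condition
history = [h_bar(P, G)]
for _ in range(2000):        # integrate up to t = 20 in steps of 0.01
    P = euler_step(P, G, A, dt=0.01)
    history.append(h_bar(P, G))
```

With these values the recorded sequence `history` never increases, and the probabilities settle at $P_\kappa / G_\kappa = \text{const.}$, here $P = (0.1, 0.3, 0.6)$, where $\bar{\bar{H}}$ takes its minimum value $-\log 10$.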

The foregoing simple and satisfactory derivation of the generalized
quantum mechanical H-theorem is originally due to Pauli.† The
derivation is very informative in giving a real insight into the mechanism by
which the approach of quantum mechanical systems towards
equilibrium takes place.

† Pauli, Sommerfeld Festschrift, Leipzig, 1928, p. 30.

The derivation involves some measure of approximation, since the
method of transition probabilities is based on an approximate rather
than on an exact integration of Schroedinger’s equation. Furthermore,
although the derivation is applicable to systems in general, it assumes
that our knowledge of the condition of a system is to be obtained from
the particular kind of measurements that correspond to the eigenstates
for some unperturbed Hamiltonian $\mathbf{H}^{0}$ for the system. Moreover,
although the derivation tells us that $\bar{\bar{H}}$ can be expected to move in
the direction of equilibrium at the time an ensemble is set up to
represent the results of an initial measurement, it does not provide all the
information that would be desirable and possible as to the continued
behaviour of the ensemble. The first two of these inadequacies will be
removed and the last at least partially remedied by the derivation to
be given in the next section.

106. Change of $\bar{\bar{H}}$ with time from the exact integration of the
Schroedinger equation

(a) The representative ensemble. We now turn to a more general, 
more rigorous, and more complete consideration of the generalized H- 
theorem. The treatment will be based on results obtained from the 
exact integration of the Schroedinger equation, rather than on the 
results as to transition probabilities obtained from the approximate 
integration of that equation by the method of variation of constants. 

In the interests of generality we shall now assume that we are going 
to be concerned with observations on the system of interest which could 
be regarded as approximate measurements of the value of any desired 
kind of quantity F pertaining to the system. Such observations can 
be taken as giving approximate information as to the probability that 
the system would actually behave as though in one or another of the 
precise eigenstates k, that would be determined by equations of the
form

$$\mathbf{F}\, u_k(q) = F_k\, u_k(q), \tag{106.1}$$

where $\mathbf{F}$ is the quantum mechanical operator, and the $F_k$ and $u_k(q)$ are
the eigenvalues and eigenfunctions corresponding to the observable F.
In accordance with this character of the contemplated observations, it 
will then be appropriate to treat the behaviour of the system of interest 
with the help of a representative ensemble described in the language 
provided by these eigenstates k. 

In order to set up such a representative ensemble, let us consider
that we make an initial observation on the system of interest, at time
t = 0, which furnishes values at that time for the probabilities $P_\kappa(0)$
for different groups of neighbouring eigenstates k between which
our approximate measurements cannot distinguish. In accordance with
the definition given in § 104 (a), we can then take the coarse-grained
probabilities $P_k(0)$ for different states k at that time as given by

$$P_k(0) = \frac{P_\kappa(0)}{G_\kappa}. \tag{106.2}$$



Hence, in accordance with our fundamental hypothesis as to equal
a priori probabilities and random a priori phases, we can now describe
the initial state of the representative ensemble by the density matrix

$$\rho_{kl}(0) = P_k(0)\, \delta_{kl}. \tag{106.3}$$

Thus the initial state of the ensemble, at time t = 0, is such as to give
equal fine-grained and coarse-grained probabilities

$$\rho_{kk}(0) = P_k(0) \tag{106.4}$$

for any state k of the kind that concerns us, and such as to make the
non-diagonal elements of the density matrix equal to zero, with

$$\rho_{kl}(0) = 0 \quad (k \neq l) \tag{106.5}$$

in the language that is being employed.

To use this ensemble for predicting the result of later observations
on the system of interest it will be necessary to have some principle
which will describe the temporal behaviour of the coarse-grained
probabilities $P_k(t)$ for the different states k involved in our measurements.
It is for this reason that we are interested in determining the laws of
behaviour for the quantity $\bar{\bar{H}}$, characterizing the distribution of
coarse-grained probability in accordance with the definition of § 104 (b),

$$\bar{\bar{H}} = \sum_k P_k \log P_k. \tag{106.6}$$

We shall show that the general behaviour of this quantity is to exhibit
a tendency to decrease with time, and hence, as we shall examine in
some detail in § 109, to proceed to as uniform a distribution of
coarse-grained probabilities as is consistent with the energy principle.

(b) Needed results of the exact integration of Schroedinger’s equation.
In the interests of rigour we shall base our present demonstration of
the tendency for $\bar{\bar{H}}$ to decrease with time on results obtained from the
exact integration of the generalized Schroedinger equation for isolated
systems. This will also satisfy our interests for a more complete
treatment of the temporal behaviour of $\bar{\bar{H}}$ than could be obtained from the
use of transition probabilities, since the exact integration provides
expressions holding at any time t = t later than the initial time t = 0.

In accordance with the exact integration of Schroedinger’s equation,
as carried out in § 96, and as applied to the behaviour of ensembles in
§ 101, we can take the fine-grained probability $\rho_{nn}(t)$ for finding a
member of the ensemble in any state n at time t = t as given by an equation
(101.4) of the form

$$\rho_{nn}(t) = \sum_k |U_{nk}(t)|^2 \rho_{kk}(0) + \sum_k \sum_{l \neq k} U_{nl}^{*}(t)\, U_{nk}(t)\, \rho_{kl}(0), \tag{106.7}$$

where the quantities $\rho_{kl}(0)$ are the elements of the density matrix at
an initial time t = 0, and the quantities $U_{nk}(t)$ are the elements of the
unitary transformation matrix that correspond to the operator $\mathbf{U}(t)$ in
the manner discussed in § 96 (d). Furthermore, if the density matrix
is diagonal at the initial time t = 0, as will be true in the cases of
present interest as a consequence of the hypothesis of random a priori
phases (see (106.3)), this expression will reduce to the simpler form
(101.6)

$$\rho_{nn}(t) = \sum_k |U_{nk}(t)|^2 \rho_{kk}(0). \tag{106.8}$$


This result, together with the important summation properties for
the elements of the transformation matrix, which correspond to its
unitary character,

$$\sum_n |U_{nk}(t)|^2 = 1 \tag{106.9}$$

and

$$\sum_k |U_{nk}(t)|^2 = 1, \tag{106.10}$$

provide those consequences of the exact integration of Schroedinger’s
equation which will be necessary for the demonstration.

In addition to making use of these quantum mechanical results, it 
will also be necessary in carrying out the demonstration to make use of 
the essentially positive character of a certain combination of quantities, 
as given by the expression 


$$x \log x - x \log y - x + y \ge 0, \tag{106.11}$$


where x and y are themselves quantities which cannot assume negative
values. The validity of this expression has already been demonstrated
in § 61 (b). The equality sign applies only when x and y are equal, and
the combination assumes more and more rapidly increasing positive 
values as y is made increasingly different from x. 

(c) The Klein relation as a necessary lemma. Before proceeding to
our ultimate investigation of the temporal behaviour of the quantity
$\sum_k P_k \log P_k$, by which $\bar{\bar{H}}$ is defined, it will first be necessary to deduce
a lemma as to the behaviour of the similar quantity $\sum_k \rho_{kk} \log \rho_{kk}$,
containing fine-grained instead of coarse-grained probabilities. The
relation to be derived was first obtained by Klein.†

† Klein, Zeits. f. Phys. 72, 767 (1931).

To deduce this lemma we begin by defining an auxiliary quantity

$$Q_{kn} = \rho_{kk}(0) \log \rho_{kk}(0) - \rho_{kk}(0) \log \rho_{nn}(t) - \rho_{kk}(0) + \rho_{nn}(t), \tag{106.12}$$

where $\rho_{kk}(0)$ and $\rho_{nn}(t)$ are the fine-grained probabilities for the states
k and n respectively at the initial time t = 0 and at any later time of
interest t = t. In accordance with (106.11) and with the circumstance
that the probabilities $\rho_{kk}(0)$ and $\rho_{nn}(t)$ could not themselves be negative
on account of their physical significance, $Q_{kn}$ is itself a quantity which
cannot be less than zero.

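Multiplying the essentially positive quantity $Q_{kn}$ by the essentially
positive quantity $|U_{nk}(t)|^2$ and summing over all values of n and of k,
we then obtain a result which can be written out in detail in the form

$$\sum_{n,k} |U_{nk}|^2 Q_{kn} = \sum_k \sum_n |U_{nk}|^2 \rho_{kk}(0) \log \rho_{kk}(0) - \sum_n \sum_k |U_{nk}|^2 \rho_{kk}(0) \log \rho_{nn}(t) - \sum_k \sum_n |U_{nk}|^2 \rho_{kk}(0) + \sum_n \sum_k |U_{nk}|^2 \rho_{nn}(t) \ge 0. \tag{106.13}$$
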


By applying the consequence of Schroedinger’s equation given by 
(106.8) to the second of the explicit terms given above, and by applying 
the summation properties for the elements of the transformation matrix 
given either by (106.9) or (106.10) to the other three terms, we then 
obtain the result 


$$\sum_k \rho_{kk}(0) \log \rho_{kk}(0) - \sum_n \rho_{nn}(t) \log \rho_{nn}(t) - \sum_k \rho_{kk}(0) + \sum_n \rho_{nn}(t) \ge 0.$$

Or, since the summations of the probabilities $\rho_{kk}(0)$ and $\rho_{nn}(t)$ over all
states will both have the value unity and hence cancel out, this can
also be written in the finally desired form

$$\sum_k \rho_{kk}(0) \log \rho_{kk}(0) \ge \sum_n \rho_{nn}(t) \log \rho_{nn}(t). \tag{106.14}$$


The result given by (106.14) may be called the Klein relation. Its
validity depends on the diagonal character of the density matrix
at the time of initial observation t = 0, since otherwise it would be
necessary to use the complete expression (106.7) rather than the
simplified expression (106.8) in evaluating the probabilities $\rho_{nn}(t)$ at
t = t. This initial character of the density matrix will be justified in
our applications, nevertheless, by the fundamental hypothesis as to
a priori phases.
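The Klein relation (106.14) can be checked numerically for an initially diagonal density matrix. In the sketch below the transformation matrix is a real orthogonal matrix built from random plane rotations; this is an assumption made for simplicity (a real orthogonal matrix is a special case of a unitary one), and the initial probabilities, the seed and the function names are likewise hypothetical.

```python
import math
import random


def random_orthogonal(n, rng):
    """Real orthogonal matrix formed as a product of random plane rotations."""
    U = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            theta = rng.uniform(0.0, 2.0 * math.pi)
            c, s = math.cos(theta), math.sin(theta)
            for col in range(n):
                a, b = U[i][col], U[j][col]
                U[i][col] = c * a - s * b   # rotate rows i and j
                U[j][col] = s * a + c * b
    return U


def evolve_diagonal(rho0, U):
    """rho_nn(t) = sum_k |U_nk|^2 rho_kk(0), the diagonal-start form (106.8)."""
    n = len(rho0)
    return [sum(U[i][k] ** 2 * rho0[k] for k in range(n)) for i in range(n)]


def sum_xlogx(p):
    """Sum of x log x, with the convention 0 log 0 = 0."""
    return sum(x * math.log(x) for x in p if x > 0.0)


rng = random.Random(0)
rho0 = [0.7, 0.2, 0.1, 0.0]        # hypothetical initial fine-grained probabilities
U = random_orthogonal(4, rng)
rho_t = evolve_diagonal(rho0, U)

# Left side minus right side of (106.14); should not be negative.
klein_gap = sum_xlogx(rho0) - sum_xlogx(rho_t)
```

The squared elements of `U` sum to unity along every row and every column, which is the content of (106.9) and (106.10), and `klein_gap` comes out non-negative for any such matrix.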

It is of interest to compare the Klein inequality as given by (106.14)
with the quantum mechanical equality

$$\sum_k [\rho(0) \log \rho(0)]_{kk} = \sum_n [\rho(t) \log \rho(t)]_{nn}, \tag{106.15}$$

which can be readily obtained as an example of the general invariance
of the trace of a matrix towards unitary transformation. The quantities
$\sum_k \rho_{kk} \log \rho_{kk}$ and $\sum_k [\rho \log \rho]_{kk}$ become identical when the density matrix
$\rho_{kl}$ is itself diagonal.

It is also of interest to make a comparison with the classical equality
(61.12)

$$\int \dots \int \rho(0) \log \rho(0) \, dq_1 \dots dp_f = \int \dots \int \rho(t) \log \rho(t) \, dq_1 \dots dp_f, \tag{106.16}$$

which played an essential role in the derivation of the classical form of
the generalized H-theorem. In the derivation of the quantum form
of the theorem the similar role will be played by the Klein inequality
(106.14) rather than by the equality (106.15).

The quantum mechanical possibility for the quantity $\sum_k \rho_{kk} \log \rho_{kk}$ to
exhibit decreased values at times t = t later than the initial time t = 0
may well be contrasted with the classical constancy of the quantity
$\int \dots \int \rho \log \rho \, dq_1 \dots dp_f$. Thus the quantum mechanics implies some
tendency, as time proceeds, towards a spreading out of fine-grained
probability over different states which has no analogue in the classical
mechanics. This quantum mechanical tendency towards increased
uniformity of distribution over precisely defined states arises, of course,
from the statistical features inherent in the exact quantum mechanics
itself even when applied directly to a single system. This is illustrated
by the finding that a quantum mechanical system started out initially
at t = 0 with unit quantum mechanical probability, $W_k(0) = 1$, for
being in a single particular precise state k must be expected at a later
time t = t to exhibit a distribution of such probabilities $W_n(t)$ over
various states n, as can be calculated with the help of (96.40).

As often emphasized in the foregoing, such statistical features, which
arise in the quantum mechanics proper as applied to precisely specified
states, must be kept distinct from those additional statistical features
which arise, in what we specifically name statistical mechanics, from
the circumstance that we then deal with systems in conditions which
are not sufficiently defined for a precise specification of state. More
particularly in the present case, the quantum mechanical tendency
towards a more uniform distribution of fine-grained probabilities $\rho_{nn}$
must not be confused, as has sometimes been done in the past, with
the whole of the statistical mechanical tendency towards a more
uniform distribution of coarse-grained probabilities $P_n$ as time
proceeds.

(d) Derivation of the generalized H-theorem. We are now ready to
investigate the temporal behaviour of coarse-grained probabilities $P_k(t)$,
by deriving the generalized H-theorem showing the tendency for

$$\bar{\bar{H}} = \sum_k P_k \log P_k \tag{106.17}$$

to decrease with time. To carry out the demonstration we may begin
by writing

$$\bar{\bar{H}}(0) - \bar{\bar{H}}(t) = \sum_k P_k(0) \log P_k(0) - \sum_n P_n(t) \log P_n(t) \tag{106.18}$$

as an evident expression for the difference between the value of $\bar{\bar{H}}$ at
an initial time t = 0 and its value at any later time t = t. Moreover,
in accordance with the circumstance that our ensemble will be set up
initially to correspond to the results of an observation made on the
system of interest at that time, it is evident, in agreement with (106.4),
that we can substitute

$$\sum_k P_k(0) \log P_k(0) = \sum_k \rho_{kk}(0) \log \rho_{kk}(0) \tag{106.19}$$

in terms of the initial fine-grained probabilities; and in accordance with
the possibility of expressing $\bar{\bar{H}}$ in either of the forms (104.7) or (104.8),
it is evident that we can also substitute

$$\sum_n P_n(t) \log P_n(t) = \sum_n \rho_{nn}(t) \log P_n(t), \tag{106.20}$$

where a partial change to fine-grained probabilities has been made.
Doing so, our expression for the difference between the two values of
$\bar{\bar{H}}$ becomes

$$\bar{\bar{H}}(0) - \bar{\bar{H}}(t) = \sum_k \rho_{kk}(0) \log \rho_{kk}(0) - \sum_n \rho_{nn}(t) \log P_n(t). \tag{106.21}$$


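To treat this expression, we may first add the Klein relation, as
derived above,

$$\sum_k \rho_{kk}(0) \log \rho_{kk}(0) \ge \sum_n \rho_{nn}(t) \log \rho_{nn}(t). \tag{106.22}$$

This gives us

$$\bar{\bar{H}}(0) - \bar{\bar{H}}(t) \ge \sum_n \rho_{nn}(t) \log \rho_{nn}(t) - \sum_n \rho_{nn}(t) \log P_n(t). \tag{106.23}$$

Adding and subtracting quantities which sum up to unity, this can
then be rewritten in the form

$$\bar{\bar{H}}(0) - \bar{\bar{H}}(t) \ge \sum_n \left[ \rho_{nn}(t) \log \rho_{nn}(t) - \rho_{nn}(t) \log P_n(t) - \rho_{nn}(t) + P_n(t) \right]. \tag{106.24}$$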

In accordance with (106.11), however, we note that the quantity in
square brackets, which is summed over all values of n, has a form which
is essentially positive, so that we can write

$$\sum_n \left[ \rho_{nn}(t) \log \rho_{nn}(t) - \rho_{nn}(t) \log P_n(t) - \rho_{nn}(t) + P_n(t) \right] \ge 0, \tag{106.25}$$

with the equality sign holding only with $\rho_{nn}(t)$ equal to $P_n(t)$ for all
states n. This then provides all that is necessary for the demonstration
of the H-theorem, since, by combining with (106.24), we are led to the
expected expression

$$\bar{\bar{H}}(0) - \bar{\bar{H}}(t) \ge 0 \tag{106.26}$$

as a description of the temporal behaviour of the quantity $\bar{\bar{H}}$ in a
representative ensemble which is set up at the time of initial observation
t = 0 for the purpose of making predictions at any later time t = t as
to the condition of some system of interest.

In accordance with this result, we see that the quantity $\bar{\bar{H}}$ for such
a representative ensemble could never exhibit at a later time a value
larger than it had when the ensemble was initially set up. We also
note, in accordance with (106.22) and (106.25), that $\bar{\bar{H}}$ could exhibit
a later value equal to its initial one only if the Klein relation were
actually satisfied by the sign of equality rather than inequality, and
if in addition the fine-grained and coarse-grained probabilities, $\rho_{nn}(t)$
and $P_n(t)$, were actually the same for all states n. These two
requirements would indeed be met in the case of an ensemble set up to represent
a condition of equilibrium with the $\rho_{nn}$ chosen so as to be independent
of time and equal to $P_n$ for all states n. Nevertheless, except in the case
of equilibrium, the possibility for a value $\bar{\bar{H}}(t)$ as large as the initial
one $\bar{\bar{H}}(0)$ must evidently be regarded as very exceptional.
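The inequality (106.26) lends itself to a direct numerical test. The sketch below sets up an initial ensemble with equal fine-grained and coarse-grained probabilities, applies many randomly chosen transformation matrices, and records the resulting drop in $\bar{\bar{H}}$. The eight states, their grouping, the initial group probabilities and the use of real orthogonal matrices in place of general unitary ones are all assumptions of this illustration.

```python
import math
import random


def random_orthogonal(n, rng):
    """Real orthogonal matrix formed as a product of random plane rotations."""
    U = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            theta = rng.uniform(0.0, 2.0 * math.pi)
            c, s = math.cos(theta), math.sin(theta)
            for col in range(n):
                a, b = U[i][col], U[j][col]
                U[i][col] = c * a - s * b
                U[j][col] = s * a + c * b
    return U


def coarse_grain(rho, groups):
    """Replace each fine-grained probability by the mean over its group."""
    P = [0.0] * len(rho)
    for g in groups:
        mean = sum(rho[n] for n in g) / len(g)
        for n in g:
            P[n] = mean
    return P


def h_bar(P):
    """H-bar = sum of P log P over states, with 0 log 0 = 0."""
    return sum(p * math.log(p) for p in P if p > 0.0)


# Hypothetical set-up: eight states in three groups; initial observation
# gives group probabilities 0.9, 0.1 and 0.
groups = [[0, 1], [2, 3, 4], [5, 6, 7]]
P_kappa_0 = [0.9, 0.1, 0.0]
rho0 = [0.0] * 8
for P_kappa, g in zip(P_kappa_0, groups):
    for k in g:
        rho0[k] = P_kappa / len(g)       # rho_kk(0) = P_k(0), as in (106.4)

rng = random.Random(1)
h0 = h_bar(coarse_grain(rho0, groups))
drops = []
for _ in range(200):
    U = random_orthogonal(8, rng)
    rho_t = [sum(U[n][k] ** 2 * rho0[k] for k in range(8)) for n in range(8)]
    drops.append(h0 - h_bar(coarse_grain(rho_t, groups)))
```

Every entry of `drops` is the difference $\bar{\bar{H}}(0) - \bar{\bar{H}}(t)$ for one trial matrix, and none of them is negative beyond rounding error.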

In addition to concluding that $\bar{\bar{H}}(t)$ will in general be smaller than
$\bar{\bar{H}}(0)$ in non-equilibrium cases, it also appears reasonable to conclude
that the ensemble, at least for a considerable period after the initial time
t = 0, would in general exhibit successively lower and lower values of
$\bar{\bar{H}}(t)$ as we go to later and later times t = t. Here, as in the analogous
part of the classical treatment, it is evident that our arguments can be
only of a qualitative and plausible kind,† unless we should be willing and
able to apply the methods of exact mechanics to a specific example.

† The qualitative arguments given here may be regarded to some extent as
supplemented by the recent results of Pauli and Fierz, Zeits. f. Phys. 106, 572 (1937), who
have shown, by introducing a somewhat conventional definition of the probabilities of
making different kinds of observations, that the long-time-average value of $\bar{\bar{H}}$ for a
perfectly isolated system would differ appreciably from the minimum possible value of
$\bar{\bar{H}}$ less and less frequently as the size of the groups characteristic of the observations
made on the system increases indefinitely. It must be emphasized, however, that the
times contemplated by Pauli and Fierz are far longer than any which can be of physical
interest or than those which we shall actually consider in §§ 109 and 111, and, further,
that these authors purposely neglect those contributions to the decrease of $\bar{\bar{H}}$ with time
which are a direct consequence of the quantum mechanical disturbance of the system
by observations and which we shall discuss under the term quantum mechanical
spreading.

To apply general, qualitative arguments, let us consider an ensemble
which is set up at t = 0 to correspond to an initial observation which
we take for simplicity as showing that the system of interest might be
equally well in any one of a single group $\kappa$ of neighbouring states k
between which our approximate measurements do not distinguish. At
the initial time t = 0, as a consequence of our rules of procedure, we
must then take $\rho_{kk}(0)$ and $P_k(0)$ equal, with the same positive value for
all states k in the group $\kappa$, and with the value zero for all states not
in the group $\kappa$. As time proceeds, however, we may expect at least for
a considerable interval after t = 0 a gradual increase in the probabilities
$\rho_{nn}(t)$ and $P_n(t)$ for states outside the original group $\kappa$ and a decrease
in probabilities for states inside that group.

With such a gradual change in the state of the ensemble there would
then be two factors operating which would tend to give continually
decreasing values of $\bar{\bar{H}}$. In accordance with the first of the two
inequalities (106.22), on which the above derivation of the H-theorem
has been based, one factor would consist in the circumstance that the
occupation of more and more states n would lead to lower and lower
values of $\sum_n \rho_{nn} \log \rho_{nn}$. In accordance with the second of those
inequalities (106.25), a second factor would consist in the circumstance
that the unequal rates at which $\rho_{nn}$ would grow for different states
outside the original group $\kappa$ and diminish for states inside that group
would lead to the development of inequalities between the values of
$\rho_{nn}$ and $P_n$ for different states n. These two factors on which the
decrease of $\bar{\bar{H}}$ with time depends may well be given descriptive names.
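The two contributions to the decrease can be separated explicitly in a small example. The sketch below assumes four states in two groups and a transformation that rotates only the plane of the second and third states through an angle `theta`; these choices, and the function names, are hypothetical and serve only to show how the total drop splits into a part governed by (106.22) and a part governed by (106.25).

```python
import math


def sum_xlogx(p):
    """Sum of x log x, with the convention 0 log 0 = 0."""
    return sum(x * math.log(x) for x in p if x > 0.0)


def coarse_grain(rho, groups):
    """Replace each fine-grained probability by the mean over its group."""
    P = [0.0] * len(rho)
    for g in groups:
        mean = sum(rho[n] for n in g) / len(g)
        for n in g:
            P[n] = mean
    return P


def two_factors(theta):
    """Return (first factor, second factor, total drop in H-bar) for angle theta."""
    groups = [[0, 1], [2, 3]]
    rho0 = [0.5, 0.5, 0.0, 0.0]          # system observed in the first group
    c2, s2 = math.cos(theta) ** 2, math.sin(theta) ** 2
    # Squared matrix elements |U_nk|^2 for a rotation mixing states 1 and 2.
    W = [[1.0, 0.0, 0.0, 0.0],
         [0.0, c2, s2, 0.0],
         [0.0, s2, c2, 0.0],
         [0.0, 0.0, 0.0, 1.0]]
    rho_t = [sum(W[n][k] * rho0[k] for k in range(4)) for n in range(4)]
    P_t = coarse_grain(rho_t, groups)
    # First factor: gap in the Klein relation (106.22).
    first = sum_xlogx(rho0) - sum_xlogx(rho_t)
    # Second factor: sum of rho_nn log(rho_nn / P_n), the content of (106.25).
    second = sum(r * math.log(r / p) for r, p in zip(rho_t, P_t) if r > 0.0)
    # Total drop H(0) - H(t); initially P_k(0) = rho_kk(0).
    total = sum_xlogx(rho0) - sum_xlogx(P_t)
    return first, second, total


first, second, total = two_factors(math.pi / 4)
```

For every angle the two parts are separately non-negative and add up to the total decrease, and both vanish at `theta = 0`, when nothing has yet left the original group.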

The first of the two factors, the tendency for the occupation of more
and more states n to lead to lower and lower values of $\sum_n \rho_{nn}\log\rho_{nn}$,
is a specific quantum effect which may be called the quantum mechanical
spreading of fine-grained probability. There is no similar classical effect
since the classical conservation of fine-grained probability $\rho$, as we
follow any moving point in the phase space, leads to a constant value
in time for the classical quantity $\int\cdots\int \rho\log\rho\;dq_1\cdots dp_f$. The quantum
mechanical analogue of this classical result is the constancy of the
matrix trace $\sum_k [\rho\log\rho]_{kk}$, expressed by (106.16), and holding in any
quantum mechanical language during the undisturbed behaviour of the
ensemble. Hence the changes in $\bar{\bar{H}}$ connected with quantum mechanical
spreading are not to be thought of as associated with the undisturbed
development in time of the systems composing the ensemble, but rather
with the irreversible disturbance associated with the act of subsequent
observation on some particular kind of state n. It will be appreciated
that the effects of quantum mechanical spreading are of predominant
importance when the observations on the system approach in definition 
the limits permitted by the quantum mechanics, and may be expected 
to be unimportant in the opposite limit of observations so crude that 
the accompanying disturbance of the system may be neglected. 

The second of the two factors on which the decrease in $\bar{\bar{H}}$ with time
depends, namely the tendency for inequalities to develop between the
fine- and coarse-grained probabilities $\rho_{nn}$ and $P_n$, is an effect, also
present in the classical statistics, which may be called the inhomogeneous
redistribution of fine-grained probability. In the classical
statistics it provided the only mechanism for the decrease in $\bar{\bar{H}}$ with
time. In the quantum statistics it will be appreciated that the effects
of inhomogeneous redistribution will grow in importance with increasing
size of the groups of states k involved in the observations made on the
system, since as the size of these groups increases, on the one hand
the distinction between fine- and coarse-grained probabilities becomes
more marked, and, on the other hand, the disturbances produced by the
observations themselves, i.e. the specific quantum mechanical features
of the observations, become less and less important. In fact, it has
recently been shown by Pauli and Fierz,† if we consider the limiting
case where the groups k comprise a very large number of states, that
the inhomogeneous redistribution is then alone sufficient to assure an
ultimate decrease of $\bar{\bar{H}}$ to a value which in general differs from the
minimum possible value less and less as the groups k increase in size.

In view of the foregoing discussion, it is evident, in the case of an
ensemble set up to represent the observation of a non-equilibrium condition,
that both the factors of quantum mechanical spreading and of
inhomogeneous redistribution could be expected to operate at least over
a considerable period of time in the direction of lower and lower values
of $\bar{\bar{H}}$. Hence we shall now regard ourselves as justified in expecting
not only that the quantity

$$\bar{\bar{H}} = \sum_k P_k \log P_k \tag{106.27}$$

can never assume a value larger than its initial one in the case of an
ensemble set up to represent an approximate observation of the state
k for some system of interest, but also in general that this quantity
would tend to assume smaller and smaller values as time proceeds until
we approach the minimum value of $\bar{\bar{H}}$ permitted by the nature of the
systems composing the ensemble and by such restrictions as are imposed
by the conservation of energy. As the final condition of the
ensemble, corresponding to equilibrium for the system of interest, we
can then take as uniform a distribution of the coarse-grained probabilities
for different states k as is physically possible.

† Pauli and Fierz, Zeits. f. Phys. 106, 572 (1937).

(e) Further discussion of the generalized H-theorem. In concluding
this treatment of the generalized H-theorem, several further points of
interest may be mentioned.

In the first place, it is well to emphasize that the tendency towards
a preferred direction of change, which has been found in the foregoing,
is a consequence of the circumstance that our ensembles are initially
set up in a particular kind of state, with equal fine-grained and
coarse-grained probabilities and with random phases for all states k of the kind
involved in observation. The directional character of the behaviour
then resulting stands in no conflict with the possibility of specifying
exact behaviours for a single system which would agree with the principle
of dynamical reversibility, also valid in the quantum mechanics,
as we have seen in § 95. It will be noted, however, that this directional
character is now not only due to the development of inequalities
between fine-grained and coarse-grained probabilities as in the classical
statistics, but also to the purely quantum effect of spreading which
would operate, for example, in the case of a single system started out
in a single precisely defined state.

It is of interest in this connexion to note again that the Klein relation
(106.14), which controls the general character of the quantum effect
of spreading when more than a single initial state is involved, has a
validity which definitely depends on the assumption of random phases
at the initial time $t = 0$. The assumption of random phases at the
later time $t = t$ would make it necessary to reverse the sign of inequality
in that expression. It is also of interest to recall once more our earlier
finding, as demonstrated by (101.8), that an initial randomness of phase
for the states k represented in an ensemble would in general be lost as
time proceeds. We now see that the loss in the randomness of the
phases of states can be regarded in a sense as compensated by a gain
in the randomness of distribution over states.

In connexion with the tendency for $\bar{\bar{H}}$ to decrease with time towards
a minimum equilibrium value, inquiry might be made into the rate at
which the approach to equilibrium would occur. It is clear that no
general quantitative answers to this important question can be given
since the rate would depend on the specific character of the system of
interest. Nevertheless, it is evident that we can often expect a rapid
decrease in $\bar{\bar{H}}$ to start at once as a consequence of the combined effects
of quantum mechanical spreading and of inhomogeneous redistribution,
and that an important measure of redistribution in the values of the
$P_k$ might take place in a short time. In the quantum as in the classical
statistics there is no reason for assuming that very long periods of time
would be necessary for the approach to equilibrium.

In the foregoing treatment of the H-theorem it was assumed that the
representative ensemble, set up to correspond to an initial approximate
measurement of a particular kind of quantity pertaining to the system
of interest, was then to be used for making reasonable predictions as
to the result of a later measurement on the system of just the same
kind. To give a more general treatment, it will be noted that our
derivation of the relation $\bar{\bar{H}}(0) - \bar{\bar{H}}(t) \geq 0$ depends in principle, in the
first place, on the nature of the initial state of the ensemble as expressed
by $\rho_{kl}(0) = P_k(0)\,\delta_{kl}$ and, in the second place, on the summation properties
exhibited by the elements of the transformation matrix $U_{nk}$ in
accordance with the unitary character of the transformation from $t = 0$
to $t = t$. If, however, we denote by $S_{n'n}$ the elements of the unitary
matrix for transforming at a given time from an original quantum
mechanical language to another language corresponding to states $n'$, it
would also be possible to consider a combined unitary transformation
to a later time and to another language, with the help of the matrix
elements

$$U'_{n'k} = \sum_n S_{n'n}\,U_{nk}, \tag{106.28}$$

which would themselves have unitary character in accordance with the
discussion of § 67 (e). Using similar methods to those employed above
in deriving the H-theorem, this then makes it possible to obtain an
expression of the form

$$\bar{\bar{H}}(0) - \bar{\bar{H}}'(t) = \sum_k P_k(0)\log P_k(0) - \sum_{n'} P_{n'}(t)\log P_{n'}(t) \geq 0, \tag{106.29}$$

provided we still assume the same initial character of the ensemble
expressed by $\rho_{kl}(0) = P_k(0)\,\delta_{kl}$.

This result is of considerable interest in giving some information as
to the probabilities for states of a different kind from those originally
approximately observed. The result applies also if we merely transform
to a new language at the original time $t = 0$. It demonstrates that we
cannot increase $\bar{\bar{H}}$ above its initial value either by going to a later time
or by contemplating a different kind of observation. It is to be remarked,
however, that investigation shows that there are no longer the
same reasons for expecting continually decreasing values of $\bar{\bar{H}}'$ with time
if the states $n'$ are of a very different kind from the original states n.
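The combined transformation can be checked on the smallest possible case. The following Python sketch makes several assumptions that are not in the text: two states only, groups of a single state (so that coarse- and fine-grained probabilities coincide), and arbitrarily chosen forms for the two unitary matrices, here called `U` and `S`. It verifies that their product is again unitary and that the sum of $P\log P$ in the new language does not exceed its initial value.

```python
import math


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def is_unitary(m, tol=1e-12):
    """Check that m times its conjugate transpose is the unit matrix."""
    n = len(m)
    dagger = [[m[j][i].conjugate() for j in range(n)] for i in range(n)]
    prod = matmul(m, dagger)
    return all(abs(prod[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(n) for j in range(n))


def sum_p_log_p(probs):
    """Return the sum of p*log(p), using the convention 0*log(0) = 0."""
    return sum(p * math.log(p) for p in probs if p > 0.0)


theta, beta = 0.4, 0.3  # arbitrary illustrative angles

# U: development from t = 0 to t = t in the original language (assumed form).
U = [[complex(math.cos(theta)), complex(-math.sin(theta))],
     [complex(math.sin(theta)), complex(math.cos(theta))]]
# S: change of language from states n to states n' (assumed form).
S = [[complex(math.cos(beta)), 1j * math.sin(beta)],
     [1j * math.sin(beta), complex(math.cos(beta))]]

V = matmul(S, U)  # combined transformation to a later time and a new language

p_initial = [0.9, 0.1]  # P_k(0) for a diagonal initial density matrix
p_final = [sum(abs(V[n][k]) ** 2 * p_initial[k] for k in range(2))
           for n in range(2)]

h_initial = sum_p_log_p(p_initial)
h_final = sum_p_log_p(p_final)
print("combined matrix unitary:", is_unitary(V))
print("initial and final values:", h_initial, h_final)
```

Setting `theta = 0.0` reduces `U` to the unit matrix and gives the case of a mere change of language at the original time.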



We may close this long section on the generalized H-theorem by
laying emphasis on the appropriate and powerful character of the
statistical methods of Gibbs, also in the quantum mechanics, in investigating
the theory of temporal behaviour.

107. Application of H-theorem to interacting systems

In the foregoing derivation of the H-theorem it has been assumed
that the system of interest could be regarded as completely isolated from
other systems and from its surroundings. It was this circumstance which
made it appropriate to represent the system of interest by an ensemble
of members, each of which was itself taken as isolated, and thus made
it then possible to base the derivation on results obtained from the
integration of the Schroedinger equation in the form applying to isolated
systems. It is evident, however, that we can also be interested in situations
where a transfer of energy could take place between two or more
interacting systems or between an enclosed system and its surroundings.

Such a situation would obviously arise when we wish to treat the 
behaviour of some system of interest which has been purposely placed 
in good ‘thermal contact’ with a heat reservoir. Such a situation would 
also arise, moreover, even when we wish to treat the behaviour of a 
practically isolated system, provided the possible fluctuation of energy 
between the system proper and its walls is not small enough so that its 
effects can be appropriately neglected. Indeed, as we shall see in the 
next chapter, the two cases, of a system in good ‘thermal contact’ with 
a heat reservoir or in what we shall call ‘essential’ rather than ‘perfect’ 
isolation from its surroundings, will be of primary importance in pro- 
viding a statistical mechanical explanation of the principles of thermo- 
dynamics as applied to the behaviour of ordinary macroscopic systems. 

In order to study the consequences of the H-theorem when applied
to interacting systems, let us consider two individual systems $S_1$ and $S_2$
which can be placed in contact with each other in such a way as to
form a combined system which can itself be regarded as isolated and
hence treated by the methods already developed for isolated systems.
Let us introduce symbols of the forms $k_1$, $l_1$, $m_1$, etc., and $k_2$, $l_2$, $m_2$, etc.,
to denote states, of the kind proposed for observation, for the two individual
systems $S_1$ and $S_2$ respectively. And assuming weak interaction
between the individual systems when in contact, let us regard the
possible states for the combined system $S_{12}$ as consisting of associated
pairs of the above states, which can be denoted by symbols of the form
$k_1k_2$, $l_1l_2$, etc.



At some initial time $t = 0$ we may now regard ourselves as making
approximate observations on the states of the two individual systems
$S_1$ and $S_2$, and if interaction has not already been established between
$S_1$ and $S_2$ we may regard them as placed in contact with each other at
$t = 0$ in order to form the combined system $S_{12}$. In accordance with
our general statistical procedure, the initial conditions of the two
systems $S_1$ and $S_2$ can then be represented by setting up a pair of
ensembles, with the diagonal density matrices

$$\rho_{k_1l_1}(0) = P_{k_1}(0)\,\delta_{k_1l_1} \quad\text{and}\quad \rho_{k_2l_2}(0) = P_{k_2}(0)\,\delta_{k_2l_2} \tag{107.1}$$

that correspond to our observational determination of the coarse-grained
probabilities $P_{k_1}(0)$ and $P_{k_2}(0)$ and to the choice of random phases
for different states. Furthermore, an appropriate ensemble to give a
suitable representation of the initial condition of the combined system
$S_{12}$ can evidently be set up by regarding each member of the ensemble
for $S_1$ as combined separately with each member of the ensemble for
$S_2$, thus giving a new ensemble, with a number of members equal to
the product of the numbers in the two uncombined ensembles. The
initial density matrix for the combined ensemble would then be expressed by

$$\rho_{k_1k_2,\,l_1l_2}(0) = P_{k_1}(0)\,P_{k_2}(0)\,\delta_{k_1l_1}\,\delta_{k_2l_2}. \tag{107.2}$$

This matrix would be diagonal in agreement with the random phases
in the two original ensembles; and the coarse-grained probabilities in
the new ensemble would be given in terms of those for the original
ensembles by

$$P_{k_1k_2}(0) = P_{k_1}(0)\,P_{k_2}(0), \tag{107.3}$$

as a consequence of the method of construction.

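At the initial time $t = 0$ we may then write as expressions for $\bar{\bar{H}}$,
corresponding to the distributions that represent the two individual
systems $S_1$ and $S_2$,

$$\bar{\bar{H}}_1(0) = \sum_{k_1} P_{k_1}(0)\log P_{k_1}(0) \quad\text{and}\quad \bar{\bar{H}}_2(0) = \sum_{k_2} P_{k_2}(0)\log P_{k_2}(0). \tag{107.4}$$

And we may write as an expression for $\bar{\bar{H}}$, corresponding to the distribution
that represents the combined system $S_{12}$,

$$\begin{aligned}
\bar{\bar{H}}_{12}(0) &= \sum_{k_1,k_2} P_{k_1k_2}(0)\log P_{k_1k_2}(0) \\
&= \sum_{k_1,k_2} P_{k_1}(0)\,P_{k_2}(0)\,\{\log P_{k_1}(0) + \log P_{k_2}(0)\} \\
&= \sum_{k_1} P_{k_1}(0)\log P_{k_1}(0) + \sum_{k_2} P_{k_2}(0)\log P_{k_2}(0) \\
&= \bar{\bar{H}}_1(0) + \bar{\bar{H}}_2(0),
\end{aligned} \tag{107.5}$$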

where the second form of writing is made possible by (107.3) and the
next to the last form by the circumstance that the summation of a
probability over all possible states will give unity.
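The additivity expressed by (107.5) is easily confirmed numerically. The following Python sketch uses made-up probabilities for two small systems (the values and the names `p1`, `p2`, `p12` are assumptions for illustration); it forms the product distribution of (107.3) and compares the value for the combined ensemble with the sum of the separate values.

```python
import math


def sum_p_log_p(probs):
    """Return the sum of p*log(p), using the convention 0*log(0) = 0."""
    return sum(p * math.log(p) for p in probs if p > 0.0)


p1 = [0.5, 0.3, 0.2]  # P_{k1}(0) for system S1 (illustrative values)
p2 = [0.7, 0.3]       # P_{k2}(0) for system S2 (illustrative values)

# (107.3): each pair of states k1 k2 receives the product of probabilities.
p12 = [a * b for a in p1 for b in p2]

h1, h2, h12 = sum_p_log_p(p1), sum_p_log_p(p2), sum_p_log_p(p12)
print(f"H1(0) + H2(0) = {h1 + h2:.6f}")
print(f"H12(0)        = {h12:.6f}")
```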

Let us now regard the combined ensemble as left undisturbed by
external influences from the time $t = 0$ to some later time $t = t$. In
accordance with our derivation of the H-theorem in the simple form
for isolated systems, we may then expect a value for $\bar{\bar{H}}_{12}$ at the later
time satisfying in any case the relation

$$\bar{\bar{H}}_{12}(t) \leq \bar{\bar{H}}_{12}(0), \tag{107.6}$$

and indeed, if we wait long enough, we may expect a tendency for the
quantity $\bar{\bar{H}}_{12}$ to decrease to a final equilibrium value.

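At this later time $t = t$ we may now consider the possibility of making
renewed observations on the two individual systems $S_1$ and $S_2$, and also
of separating the two systems from each other if that should be desired.
At this later time we can write

$$\bar{\bar{H}}_1(t) = \sum_{k_1} P_{k_1}(t)\log P_{k_1}(t) \quad\text{and}\quad \bar{\bar{H}}_2(t) = \sum_{k_2} P_{k_2}(t)\log P_{k_2}(t) \tag{107.7}$$

as explicit expressions for the values of $\bar{\bar{H}}$ corresponding to the distributions
of probabilities for states of the two individual systems $S_1$
and $S_2$, and can write

$$\bar{\bar{H}}_{12}(t) = \sum_{k_1,k_2} P_{k_1k_2}(t)\log P_{k_1k_2}(t) \tag{107.8}$$

as an explicit expression for the value of $\bar{\bar{H}}$ corresponding to the distribution
for the combined system $S_{12}$. It will not be possible in
general, however, to set $P_{k_1k_2}(t)$ equal to $P_{k_1}(t)\,P_{k_2}(t)$, since there may be a
tendency for particular states $k_1$ for systems of the kind $S_1$ to be preferentially
associated with particular states $k_2$ for systems of the kind
$S_2$. Nevertheless, we can in any case write the evident relations

$$\sum_{k_1,k_2} P_{k_1k_2}(t) = \sum_{k_1} P_{k_1}(t) = \sum_{k_2} P_{k_2}(t) = 1 \tag{107.9}$$

and

$$P_{k_1}(t) = \sum_{k_2} P_{k_1k_2}(t), \qquad P_{k_2}(t) = \sum_{k_1} P_{k_1k_2}(t) \tag{107.10}$$

as a consequence of the significance of the quantities involved as expressing
probabilities. Combining (107.7) and (107.8), we may then
write, with the help of (107.9) and (107.10),

$$\begin{aligned}
\bar{\bar{H}}_{12}(t) - \bar{\bar{H}}_1(t) - \bar{\bar{H}}_2(t)
&= \sum_{k_1,k_2} P_{k_1k_2}(t)\log P_{k_1k_2}(t) - \sum_{k_1} P_{k_1}(t)\log P_{k_1}(t) - \sum_{k_2} P_{k_2}(t)\log P_{k_2}(t) \\
&= \sum_{k_1,k_2} \bigl[ P_{k_1k_2}(t)\log P_{k_1k_2}(t) - P_{k_1k_2}(t)\log P_{k_1}(t)P_{k_2}(t) - P_{k_1k_2}(t) + P_{k_1}(t)P_{k_2}(t) \bigr],
\end{aligned} \tag{107.11}$$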



where in the last mode of writing we have added and subtracted terms
which cancel out on summation. In accordance with the form of
(107.11), we thus obtain a summation over terms of an essentially
positive character (see (106.11)), and are hence led to the useful result

$$\bar{\bar{H}}_1(t) + \bar{\bar{H}}_2(t) \leq \bar{\bar{H}}_{12}(t). \tag{107.12}$$

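Collecting together the expressions of special interest as given by
(107.5), (107.6), and (107.12), and combining their consequences, we
can then write

$$\bar{\bar{H}}_1(0) + \bar{\bar{H}}_2(0) = \bar{\bar{H}}_{12}(0), \tag{107.13}$$

$$\bar{\bar{H}}_{12}(t) \leq \bar{\bar{H}}_{12}(0), \tag{107.14}$$

$$\bar{\bar{H}}_1(t) + \bar{\bar{H}}_2(t) \leq \bar{\bar{H}}_{12}(t), \tag{107.15}$$

$$\bar{\bar{H}}_1(t) + \bar{\bar{H}}_2(t) \leq \bar{\bar{H}}_1(0) + \bar{\bar{H}}_2(0), \tag{107.16}$$

as a summary of results which will concern us in treating the statistical
behaviour of a pair of individual systems $S_1$ and $S_2$ which can interact
with each other, over a time interval $t = 0$ to $t = t$, in the form of
a combined system $S_{12}$.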
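The step from (107.11) to (107.12) can be followed on a small example. In the following Python sketch the joint probabilities are invented for illustration and deliberately correlated, so that the joint distribution is not the product of its marginals. The sketch evaluates each bracketed term of (107.11), of the form x log x - x log y - x + y, and compares the total with the difference between the combined value and the two separate values.

```python
import math

# Joint probabilities P_{k1 k2}(t); rows are k1, columns are k2.
# Illustrative values only, chosen so that S1 and S2 are correlated.
joint = [[0.4, 0.1],
         [0.1, 0.4]]

p1 = [sum(row) for row in joint]        # probabilities for states of S1
p2 = [sum(col) for col in zip(*joint)]  # probabilities for states of S2


def sum_p_log_p(probs):
    """Return the sum of p*log(p), using the convention 0*log(0) = 0."""
    return sum(p * math.log(p) for p in probs if p > 0.0)


h1, h2 = sum_p_log_p(p1), sum_p_log_p(p2)
h12 = sum_p_log_p([x for row in joint for x in row])

# Bracketed terms of (107.11), with x = P_{k1 k2} and y = P_{k1} P_{k2}.
# All entries of `joint` are positive here, so the logarithms are defined.
terms = []
for i, row in enumerate(joint):
    for j, x in enumerate(row):
        y = p1[i] * p2[j]
        terms.append(x * math.log(x) - x * math.log(y) - x + y)

print("smallest term        :", min(terms))
print("sum of terms         :", sum(terms))
print("H12 - H1 - H2        :", h12 - h1 - h2)
```

Each term is non-negative and vanishes only when x = y, which is why equality in (107.12) requires the joint probabilities to factor into the product of the separate ones.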

In accordance with the derivation of the H-theorem for an isolated
system, it is seen that the value of $\bar{\bar{H}}_{12}$ for the distribution of probabilities
representing the combined system $S_{12}$ can be expected to
decrease with time towards the minimum value that would be physically
possible. And it is also seen, as a consequence of the above, that
the sum of the values of $\bar{\bar{H}}_1$ and $\bar{\bar{H}}_2$ for the distributions representing
the two individual systems $S_1$ and $S_2$ can never be expected to exhibit
a value greater than at the initial time, since $\bar{\bar{H}}_1 + \bar{\bar{H}}_2$ is originally equal
to $\bar{\bar{H}}_{12}$ and is never again greater than $\bar{\bar{H}}_{12}$. It may be noted, however,
that $\bar{\bar{H}}_1$ and $\bar{\bar{H}}_2$ do not in general each individually have to proceed to
lower values, since an increase in the one may be compensated by a
sufficient decrease in the other.

We shall find the methods and results of this section specially useful
in the present chapter when we wish to consider the equilibria of
non-isolated systems, and in the next chapter when we wish to explain the
thermodynamic consequences of placing systems in contact with each
other.

B. RELATION OF H-THEOREM TO BEHAVIOUR AT EQUILIBRIUM
108. Relation to previous studies of equilibrium
As shown in the preceding part of this chapter, the H-theorem proves
very informing in studying the mechanism by which the approach of
a system towards equilibrium would take place. In the remainder
of the chapter we now wish to make use of the theorem in developing
more fundamental and more powerful methods of studying the condition
of equilibrium itself than have been previously available to us.
Before proceeding to that development we may first consider the relation
of the H-theorem to our earlier studies of equilibrium.

These earlier studies, as carried out in Chapter X, were based on the
use of a microcanonical ensemble, uniformly distributed over a small
energy range $E$ to $E+\delta E$, as giving a suitable representation of a
system in equilibrium with an energy lying in that range. Making use
of the equal probabilities for different precise states in such an ensemble,
we then calculated the numbers of such states G which would
correspond to different observable conditions of the system of interest,
and regarded the probabilities for these conditions as proportional to
the corresponding values of G. We then took a maximum of G, under
necessary subsidiary conditions as to total number of particles and total
energy, as giving a determination of the average (most probable) condition
for the system at equilibrium.

The relation of this procedure to the H-theorem may now be readily
seen. In accordance with (102.1), we have defined the quantity H for
a system in a given specified condition by

$$H = -\log G, \tag{108.1}$$

where G is the number of precise states that correspond to the condition.
And in accordance with the H-theorem, we have found a tendency,
in the absence of equilibrium, for this quantity to decrease with time.
Hence our earlier treatment may now be regarded as correlating the
condition of equilibrium with that which might be expected as a highly
probable final condition in a system left in isolation to carry out its
own behaviour.

Our earlier studies of equilibrium thus continue to appear reasonably
satisfactory when examined in the light of our later understanding of
the H-theorem. Nevertheless, from a theoretical point of view, it is
evident that the determination of equilibrium, as corresponding to the
maximum possible value of G and thus to the minimum possible value
of H, does not quite give a satisfactory recognition to those fluctuations
around the most probable condition which might be expected at equilibrium.
And, from a computational point of view, it will be remembered
that our previous calculations of the molecular distributions to
be expected at equilibrium were actually carried out for simplicity,
with a somewhat unsatisfactory introduction of the Stirling approximation
for factorial numbers, which, as emphasized by Fowler,† might
even involve the use of that approximation for integers as small as zero
or one. We shall be able to avoid these unsatisfactory features when
we apply the more fundamental methods of treating equilibrium which
we are now going to study.

† Fowler, Statistical Mechanics, second edition, Cambridge, 1936, p. 36.
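The size of the difficulty with small integers can be made concrete. The following Python sketch assumes the crude form of the Stirling approximation, log n! = n log n - n (the passage here does not state which form was used in the earlier calculations), and compares it with the exact value obtained from the log-gamma function.

```python
import math


def log_factorial_exact(n):
    """log(n!) computed through the log-gamma function."""
    return math.lgamma(n + 1)


def log_factorial_stirling(n):
    """Crude Stirling form n*log(n) - n, with 0*log(0) taken as 0."""
    return n * math.log(n) - n if n > 0 else 0.0


for n in (0, 1, 2, 5, 10, 100, 1000):
    exact = log_factorial_exact(n)
    approx = log_factorial_stirling(n)
    print(f"n = {n:5d}  exact = {exact:12.4f}  Stirling = {approx:12.4f}  "
          f"error = {exact - approx:8.4f}")
```

For n = 1 the crude form gives -1 where the exact value is 0, so the error is as large as the quantity it replaces; only for large n does the relative error become small.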

109. The long time behaviour of ensembles representing
perfectly isolated systems

In order to develop more fundamental methods of treating equilibria
it will be rational to begin by considering the long time behaviour of
the representative ensembles that could be set up to correspond to
systems of interest started off initially in conditions not corresponding
to equilibrium. This will then give us the kind of information appropriate
for constructing ensembles to represent the ultimate conditions
of equilibrium that would be reached by such systems. It will be convenient
to give separate attention to the two cases of systems which
are isolated from their surroundings and systems where energy interchange
with the surroundings can take place.

In the present section we consider the behaviour of ensembles corresponding
to perfectly isolated systems. To obtain such an ensemble we
may regard ourselves at some initial time $t = 0$ as making an approximate
observation on the system of interest which will give us a set of
initial values $P_k(0)$ for the coarse-grained probabilities of states k of the
kind observed. In accordance with our statistical methods, we may
then set up an appropriate ensemble to represent the actual system of
interest by taking a collection of similar systems distributed over
different states k, l, ... with the initial density matrix

$$\rho_{kl}(0) = P_k(0)\,\delta_{kl}. \tag{109.1}$$

On account of the character of the systems of interest now under consideration,
it is to be noted that each member of this ensemble must
itself be taken as perfectly isolated from its surroundings.

As time proceeds, the distribution of the ensemble will undergo
changes away from that described by (109.1), assuming that the initial
situation does not itself happen to correspond to equilibrium. This
behaviour of the ensemble can be characterized, in accordance with the
generalized H-theorem, by the tendency for the quantity $\bar{\bar{H}}$, defined by

$$\bar{\bar{H}} = \sum_k P_k \log P_k \tag{109.2}$$

summed over all states k, to decrease with time to lower values. This
tendency arises from the combined effect of two factors, which we have
described as the quantum mechanical spreading and the inhomogeneous
redistribution of fine-grained probabilities. In the case of an ensemble,
initially having a diagonal density matrix as in (109.1) and
having an arbitrary distribution of coarse-grained probabilities $P_k$ in
states k which are not themselves steady energy states, we can regard
this tendency as making it practically certain that $\bar{\bar{H}}$ will immediately
start to decrease with time, and as making it highly probable that $\bar{\bar{H}}$
will continue to decrease over a long period towards the minimum value
which would be physically possible.

As the ensemble carries out such a change with time, however, it is
evident that the possible values of the quantities $P_k$ would be subject
to restrictions. A quite general restriction immediately arises from the
significance of these quantities as probabilities, which makes it necessary
that their sum should at all times satisfy the relation

$$\sum_k P_k = 1. \tag{109.3}$$

In addition more specific restrictions arise from the necessity for each
member of the ensemble to carry out its behaviour in accordance with
the laws of the quantum mechanics.

In considering the restrictions imposed by the laws of mechanics on
the possible values of the $P_k$, we shall regard ourselves as primarily
concerned only with those which arise from the principle of the conservation
of energy. This character of our concern is due, as in the
corresponding classical considerations of § 61 (c), in the first place to
the circumstance that we shall in any case take the ensemble as only
containing members with such specified values for the components of
total linear and angular momentum as are of interest, e.g. zero values,
and in the second place to the circumstance that we regard the restrictions
imposed by the initial state of the ensemble on the values of other
‘constants of the motion’ (more complicated and physically less significant
than energy and momentum) as leading during the course of
time to exclusion from otherwise possible states k, distributed at random
within the groups of states involved in the determination of
coarse-grained probabilities, in a manner that does not have physical
consequences of interest.

To study the character of the restrictions imposed on the behaviour
of the ensemble by the energy principle, we note once more that each
member of the ensemble must itself be regarded as perfectly isolated.
Hence, in accordance with the quantum mechanical form of the law of
the conservation of energy, each member of the ensemble would at all
times have to exhibit unchanged probabilities for the different possible
values of its energy. Thus the application of the energy principle to
the ensemble as a whole may be taken as requiring constant probability
for finding a member of the ensemble in any specified narrow energy
range, of width $\Delta E$ corresponding to the accuracy of our observations.

In the special case, when our initial observation is made on unperturbed
energy states k of the system corresponding to eigenvalues $E_k^0$
which are close to the true eigenvalues of energy for the system, it is
easy to give a simple, closely approximate expression to the above
requirements of the energy principle. Under such circumstances the
probability becomes exceedingly high that a system in state k would
actually exhibit a value of energy very close to the eigenvalue $E_k^0$. This
then makes it possible to write

$$\sum_{E_k^0 = E}^{E+\Delta E} P_k = P(E \text{ to } E+\Delta E), \tag{109.4}$$

where we sum over all values of k for which $E_k^0$ lies in the range $E$ to
$E+\Delta E$, as an approximate expression for the probability of finding
a member of the system in the energy range $E$ to $E+\Delta E$; and this
expression approaches exactness as the observed states k approach the
true energy states of the system. The requirements of the energy
principle may then be imposed by taking the sum of the $P_k$, over each
such narrow energy range $\Delta E$, as being constant in time, with a value
which always remains equal to that which it had when the ensemble
was initially set up.

In the more general case, when our initial observation is made on
states k of an arbitrary character, an explicit statement of the consequences
of the energy principle may be obtained by making use of
transformation theory (see (79.3)) to furnish expressions for the constant
probabilities of any energy state n, in terms of the elements
$\rho_{kl}$ of the density matrix used in describing the ensemble, and in terms
of the elements $S_{nk}$ of the unitary matrix used in transforming from
the language of observation to that of energy. Such expressions, however,
are found to depend on the non-diagonal elements of the density
matrix $\rho_{kl}$ and on the specific values of the $S_{nk}$ in a manner that does
not permit a simple general expression of the resulting restrictions on
the possible values of the coarse-grained probabilities $P_k$, independent
of the nature of the system under consideration. As a rough approximation,
in the case of states k of an arbitrary kind, we could assume
constancy for expressions of the form (109.4) now taken as summed
over states k having expectation values of energy $\bar{E}_k$ which fall in the
various ranges $E$ to $E+\Delta E$. This might not be a very exact procedure,
however, since the probability for a value of the energy quite different
from the expectation value could now be large.
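A two-level illustration shows how far the expectation value can lie from any energy actually found. The following Python sketch uses a hypothetical state k built in equal parts from two true energy states; the energies, the weights, and the helper `probability_in_range` are all assumptions made for the example.

```python
# Hypothetical state k expanded in true energy states n:
# energies E_n and weights |S_nk|^2 (illustrative numbers only).
energies = [1.0, 9.0]
weights = [0.5, 0.5]

# Expectation value of the energy for a system in state k.
expectation = sum(w * e for w, e in zip(weights, energies))


def probability_in_range(low, high):
    """Probability of finding an energy E with low <= E < high."""
    return sum(w for w, e in zip(weights, energies) if low <= e < high)


print("expectation value of energy   :", expectation)
print("probability for 4.5 <= E < 5.5:", probability_in_range(4.5, 5.5))
print("probability for 0.5 <= E < 1.5:", probability_in_range(0.5, 1.5))
```

Assigning such a state to the energy range containing its expectation value would place it in a range where the system is never actually found, which is the inexactness referred to above.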

In accordance with the foregoing, it proves simplest in studying the
long time behaviour of representative ensembles to consider that the
initial observation on the system of interest consists in an approximate
determination of unperturbed energy eigenstates, corresponding to
some suitable selection of unperturbed Hamiltonian $H^0$ for the system.
This should not be regarded as an inappropriate kind of state to observe,
owing to our natural interest in the nearly steady conditions of the
system to which such states correspond. Furthermore, this procedure
should not be regarded as making unnecessary the general treatment
given to the H-theorem in § 106, since it was the discussion of this
general derivation which made it clear that the tendency for $\bar{\bar{H}}$ to
decrease with time could be expected to persist for a considerable period
after the initial observation.

Assuming such initial observation on nearly steady states k of the
system, we can now describe the long time behaviour of the representative
ensemble, for any perfectly isolated system of interest started
off in an arbitrary condition, as a tendency for the quantity

$$\bar{\bar{H}} = \sum_k P_k \log P_k \tag{109.5}$$

to decrease with time at least well along in the direction of the minimum
value, which would be allowed by the restrictions imposed on the $P_k$,

$$\sum_k P_k = 1, \tag{109.6}$$

as a consequence of the significance of those quantities as probabilities,
and

$$\sum_{E_k^0 = E}^{E+\Delta E} P_k = \text{const.}, \tag{109.7}$$

for each narrow range in energy, as a consequence of the energy
principle.

Furthermore, applying the usual methods of determining a conditional
minimum, we see that the minimum physically possible value of
the above quantity $\bar{\bar{H}}$ would require substantially equal values

$$P_k = \text{const.} \tag{109.8}$$

for the probabilities of the different states k which have eigenvalues
$E_k^0$ lying inside of each of the individual ranges $E$ to $E+\Delta E$, into which
we take the total possible energy range as divided. Moreover, with the
help of our previous equation (106.6) applying to the rate of change of
coarse-grained probabilities with time, we see, making use of (106.8), that
such a uniform distribution for the probabilities $P_k$ within each energy
range $\Delta E$, if once attained, could be expected to exhibit a tendency to
persist. Complete permanence would not be expected since this could
be concluded from the equations mentioned only with random phases
for the various states k.
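The conditional minimum leading to (109.8) can be reproduced numerically. Adding a multiplier for each energy range to the sum of $P_k \log P_k$ and setting the derivative with respect to $P_k$ equal to zero gives $\log P_k + 1 + \lambda = 0$, with the same $\lambda$ for every state in the range, so that the $P_k$ within one range are equal. The following Python sketch uses invented data (five states in two energy ranges, labelled by the hypothetical list `shells`) and checks that randomly generated distributions with the same sums over each range never give a lower value.

```python
import math
import random


def sum_p_log_p(probs):
    """Return the sum of p*log(p), using the convention 0*log(0) = 0."""
    return sum(p * math.log(p) for p in probs if p > 0.0)


def shell_sums(probs, shells):
    """Total probability in each energy range, as in (109.7)."""
    totals = {}
    for p, s in zip(probs, shells):
        totals[s] = totals.get(s, 0.0) + p
    return totals


def equalize_within_shells(probs, shells):
    """Give every state the mean probability of its own energy range."""
    totals = shell_sums(probs, shells)
    counts = {s: shells.count(s) for s in totals}
    return [totals[s] / counts[s] for s in shells]


# Illustrative data: the first three states lie in one energy range,
# the last two in another.
shells = [0, 0, 0, 1, 1]
p_start = [0.5, 0.1, 0.0, 0.3, 0.1]

p_equal = equalize_within_shells(p_start, shells)
h_start, h_equal = sum_p_log_p(p_start), sum_p_log_p(p_equal)

# Random trial distributions with the same sums over each energy range.
rng = random.Random(1)
target = shell_sums(p_start, shells)
trial_values = []
for _ in range(200):
    raw = [rng.random() for _ in shells]
    raw_totals = shell_sums(raw, shells)
    trial = [r * target[s] / raw_totals[s] for r, s in zip(raw, shells)]
    trial_values.append(sum_p_log_p(trial))

print("equalized probabilities:", p_equal)
print("starting value         :", h_start)
print("equalized value        :", h_equal)
print("lowest trial value     :", min(trial_values))
```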

We are thus provided with a good picture of the long time behaviour
of such a representative ensemble, for a perfectly isolated system, as
consisting in a tendency for the coarse-grained probabilities $P_k$ for
different nearly steady states k to assume ultimate values which would
be nearly constant in time and nearly the same for all states k lying
in any one of the narrow ranges $E$ to $E+\Delta E$ into which we divide the
total energy. This picture does not in any way contradict the conclusions
to which we were led by the simple methods of § 106, but has
been obtained from a more complete consideration of long time behaviour
in general, and gives attention to the possibility that more
than a single energy range $\Delta E$ may have been originally populated
with members of the ensemble.

In the case of ensembles set up to represent the initial measurement
of an arbitrary kind of state k, such a simple picture of behaviour is
not possible. We see qualitatively, however, that there will be a
tendency towards equal probabilities $P_k$ for states k which make similar
contributions to the probabilities for different energies.

110. The microcanonical ensemble as representing equilibrium 
for a perfectly isolated system 

In accordance with the foregoing, if we make an initial approximate
observation on the possible nearly steady states $k$ of an isolated system
of interest and then set up a representative ensemble to predict the
future behaviour of that system, we can expect this ensemble to proceed
towards an ultimate, approximately steady distribution with the
coarse-grained probabilities for different states $k$ appreciably dependent
only on the corresponding energy. The ultimate condition
approached by a system of interest, however, may be regarded as a
condition of equilibrium, substantially unaffected by the character of
any approximate measurements that may have been made on the
system itself, and substantially unchanging in time. To take cognizance
of these considerations in constructing ensembles for the general
representation of equilibrium in the case of perfectly isolated systems, it
will be appropriate to choose distributions having a density matrix of
the general form

$$\rho_{nm} = f(E_n)\,\delta_{nm} \qquad (110.1)$$

when expressed in the true energy language, with $f(E_n)$ dependent only
on the energies $E_n$ of the various states $n$.

Such a distribution would, as we know, remain strictly constant as
time proceeds (see, for example, (81.5)) and hence give, as expected,
a description of equilibrium that would not depend on time. Moreover,
with a suitable choice for the form of $f(E_n)$, it could be made to give
any distribution of probabilities for different energies that might be
needed to correspond to observations originally made on the system
which it represents. Furthermore, from the close connexion between
nearly steady states $k$ and true energy states $n$, it would give
coarse-grained probabilities for different states $k$ that would appreciably
depend only on the corresponding energy. Hence, in the case of an
isolated system, such a distribution would meet all requirements
appropriate for representing the condition of equilibrium, including those
which arise from treating this condition as that which would be
ultimately expected if the system is left to its undisturbed behaviour.

In what may be regarded as the typical applications of statistical 
mechanics to isolated systems, our initial observations of the system 
of interest may be taken as compatible with and as including a rough 
determination of energy. Such a determination will not be accurate 
enough to specify any precise stationary state for the system but will 
be accurate enough to fix the energy of the system within a range, say, 
$E$ to $E+\delta E$, which is quite narrow from a macroscopic point of view.
Under such typical circumstances it will then be appropriate to repre- 
sent the condition of equilibrium by giving our general expression for 
the density matrix (110.1) the special form 


$$\rho_{nm} = \begin{cases} \rho_0\,\delta_{nm} & (E_n \text{ in range } E \text{ to } E+\delta E), \\ 0 & (E_n \text{ not in range } E \text{ to } E+\delta E), \end{cases} \qquad (110.2)$$

with $\rho_0$ a constant. Or, paying particular attention to the diagonal
terms, we can write

$$P_n = \begin{cases} \rho_0 & (E_n \text{ in range } E \text{ to } E+\delta E), \\ 0 & (E_n \text{ not in range } E \text{ to } E+\delta E), \end{cases} \qquad (110.3)$$

with $\rho_0$ a constant, as an expression for the equilibrium probability of
finding the system of interest in any energy state $n$. In writing this
expression we have taken it as appropriate to equate coarse- and
fine-grained probabilities $P_n$ and $\rho_{nn}$, since observations are possible in the
case of equilibrium and of true energy states which can be carried out
with any desired degree of accuracy by taking sufficient time.
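As a concrete illustration of (110.3), the following minimal Python sketch assigns equal probabilities to the energy levels lying inside a chosen window and zero to all others. The function name `microcanonical`, the list `levels`, and the numerical values are assumptions introduced here for illustration only; the constant $\rho_0$ is fixed by normalization, a step which the text leaves implicit.

```python
def microcanonical(energies, E, dE):
    """Return probabilities P_n equal to a constant rho_0 for levels with
    E <= E_n <= E + dE and zero elsewhere, normalised so that they sum to 1."""
    inside = [E <= e <= E + dE for e in energies]
    count = sum(inside)
    if count == 0:
        raise ValueError("no energy levels in the chosen range")
    rho0 = 1.0 / count              # normalisation fixes the constant rho_0
    return [rho0 if flag else 0.0 for flag in inside]


# Hypothetical spectrum, chosen only to show the behaviour of the function.
levels = [0.0, 1.0, 1.02, 1.05, 2.0, 3.0]
P = microcanonical(levels, E=1.0, dE=0.1)
```

Only the three levels between 1.0 and 1.1 receive a non-zero probability here, and they receive the same one, whatever their detailed character.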

We are thus provided, also from the present point of view, with a
justification for the microcanonical ensemble as representing equilibrium
for a system of specified energy, and are hence provided with an apparatus
for studying the equilibrium condition for any system for which we can
calculate the character of the different energy states $n$. It will be
appreciated, however, that such ensembles would be strictly applicable
only in the case of perfectly isolated systems, where a definite specification
of energy would be significant.† Nevertheless, such ensembles may
also be used in the case of systems in contact with suitably chosen
surroundings provided the resulting energy fluctuations have an effect
which can be neglected. This is a usual case for systems of many
degrees of freedom. Having selected a suitable microcanonical ensemble,
the expected properties of the corresponding system of interest
can then be studied by taking any kind of average over the members
of the ensemble that we may select. Thus we may compute the most
probable values of quantities by the methods of Chapter X, or could
compute mean values by the methods devised by Darwin and Fowler.

111. The long time behaviour of ensembles representing sys- 
tems in contact with their surroundings 

(a) Probabilities for states of the combined system. We now turn to 
a consideration of the long time behaviour of ensembles representing 
systems in contact with their surroundings. The treatment will be 
based on our previous application of the H-theorem to two interacting 
systems as carried out in § 107. 

Employing the terminology of that section, we shall take $S_1$ as the
system proper and $S_2$ as the surroundings with which it can interact;
and shall regard their combination as giving a combined system $S_{12}$,
which can itself be treated as isolated. Proceeding as before, we may
then denote observed states of the system proper and its surroundings
by symbols of the forms $k_1$, $l_1$, $m_1$, etc., and $k_2$, $l_2$, $m_2$, etc., respectively,
and may regard the association of pairs of such states for its two parts
as giving the possible states $k_1k_2$, $k_1l_2$, $l_1l_2$, etc., for the combined
system.

† The application of the methods of statistical mechanics to the internal
rearrangements that take place in an atomic nucleus after disturbance (see
Bohr, Science, 86, 161 (1937)) provides a good example of a system which
could be appropriately treated as perfectly isolated.

The extent of the surroundings to the system proper, which should
be included in order to secure a combined system that can be appropriately
treated as isolated, would depend on specific circumstances. In
the case of a situation where a study of the consequences of thermal
interchange with the surroundings is definitely desired, it might be an
appropriate idealization to regard the combined system as consisting
of the system proper together with a suitable heat bath in which we
could think of the system as being immersed. On the other hand, in
the case of a situation where thermal interchange with the surroundings
is not desired, but where the system proper can nevertheless interact
with the walls of its container or other parts of its immediate environment,
it might be necessary to regard the combined system as including
more and more extensive surroundings as we treat processes of longer
and longer duration.

In accordance with the methods of § 107, we may now consider ourselves
at some initial time $t = 0$ as making approximate observations
of the condition of the system proper and its surroundings. Such
observations will give us initial values for the coarse-grained probabilities
$P_{k_1}(0)$ and $P_{k_2}(0)$ for states of the system proper and its surroundings
respectively, and thus will also provide initial values

$$P_{k_1k_2}(0) = P_{k_1}(0)\,P_{k_2}(0) \qquad (111.1)$$

for the coarse-grained probabilities for states of the combined system.
With the help of these findings we can then set up a representative
ensemble, with the initial diagonal density matrix

$$\rho_{k_1k_2,\,l_1l_2}(0) = P_{k_1k_2}(0)\,\delta_{k_1l_1}\delta_{k_2l_2} \qquad (111.2)$$

in order to make predictions as to the further behaviour of the combined
system.
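The product rule (111.1) can be illustrated by a minimal Python sketch. The numerical probabilities and the helper `H` are assumptions made here for illustration only. The sketch builds the joint probabilities for the combined system and checks that, for such an uncorrelated initial distribution, the sum of $P \log P$ over the combined states equals the sum of the corresponding quantities for the two parts taken separately.

```python
import math


def H(probs):
    """Sum of P log P over all states with P > 0."""
    return sum(p * math.log(p) for p in probs if p > 0.0)


P1 = [0.5, 0.3, 0.2]      # hypothetical initial values P_{k1}(0)
P2 = [0.6, 0.4]           # hypothetical initial values P_{k2}(0)

# Joint probabilities P_{k1 k2}(0) = P_{k1}(0) * P_{k2}(0), as in (111.1).
P12 = [[a * b for b in P2] for a in P1]
flat = [p for row in P12 for p in row]

H1, H2, H12 = H(P1), H(P2), H(flat)
```

The additivity holds only at the initial instant, when the two parts are statistically independent; interaction between them will in general destroy it as time proceeds.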

As time proceeds, the distribution of the ensemble will in general
undergo changes away from that described by (111.2). In agreement
with our previous treatment of isolated systems in § 109, this behaviour
of the ensemble can be regarded as characterized by a tendency for
the quantity

$$H_{12} = \sum_{k_1,k_2} P_{k_1k_2} \log P_{k_1k_2} \qquad (111.3)$$

to decrease with time, at least well along towards the minimum value
that could be attained under the appropriate restrictions on the possible
values of the $P_{k_1k_2}$. The first of these restrictions is a simple
consequence of the significance of those quantities as probabilities which
can be expressed in the form

$$\sum_{k_1,k_2} P_{k_1k_2} = 1. \qquad (111.4)$$

The remaining restrictions, which it seems necessary to consider, are
a consequence of the energy principle. We may assume for simplicity,
as in § 109, that the states under consideration are nearly steady states,
which approximate to the true energy states of the combined system,
and may assume weak enough interaction between system proper and
surroundings so that the sum of the unperturbed energy eigenvalues
$E^0_{k_1} + E^0_{k_2}$ will be a close expression for the energy of the combined
system when in state $k_1k_2$. We may then write

$$\sum_{E}^{E+\Delta E} P_{k_1k_2} = P(E \text{ to } E+\Delta E), \qquad (111.5)$$

summed over the indicated states, as a close approximation for the
probability of finding a member of the ensemble in any specified energy
range of width $\Delta E$ corresponding to the accuracy of our observations.
The restrictions imposed by the energy principle may now be expressed
by requiring constant values in time for such sums taken over each
energy range $E$ to $E+\Delta E$ which is appreciably populated.

Applying the usual methods of determining a conditional minimum,
we then see that the minimum physically possible value of the quantity
$H_{12}$, as given by (111.3) under the restrictions expressed by (111.4) and
(111.5), would require substantially equal values

$$P_{k_1k_2} = \text{const.} \qquad (111.6)$$

for the probabilities of the different states $k_1k_2$ which have energies
$E^0_{k_1} + E^0_{k_2}$ lying inside of each one of the individual ranges $E$ to $E+\Delta E$
into which we regard the total possible energy range as divided.
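The conditional minimum invoked here can be sketched with undetermined multipliers. The following is one way of carrying out the step, not necessarily the route intended in the text. The symbols $r$ (a label for the energy ranges), $G_r$ (the number of states $k_1k_2$ in range $r$), $P_r$ (the fixed probability of range $r$) and $\lambda_r$ (the multipliers) are notation assumed for this sketch only.

```latex
% Notation assumed for this sketch: r labels an energy range E to E + \Delta E,
% G_r is the number of states k_1 k_2 in that range, P_r is the fixed value of
% the sum (111.5) for that range, and \lambda_r is an undetermined multiplier.
\begin{aligned}
\Phi &= \sum_{k_1,k_2} P_{k_1k_2}\log P_{k_1k_2}
      + \sum_r \lambda_r\Bigl(\sum_{k_1k_2 \in r} P_{k_1k_2} - P_r\Bigr),\\
\frac{\partial \Phi}{\partial P_{k_1k_2}}
     &= \log P_{k_1k_2} + 1 + \lambda_r = 0
      \quad\Longrightarrow\quad
      P_{k_1k_2} = e^{-1-\lambda_r} \qquad (k_1k_2 \in r),\\
\sum_{k_1k_2 \in r} P_{k_1k_2} &= P_r
      \quad\Longrightarrow\quad
      P_{k_1k_2} = \frac{P_r}{G_r}.
\end{aligned}
```

The normalization (111.4) is satisfied automatically when the $P_r$ sum to unity, and since the second derivative $1/P_{k_1k_2}$ is positive the stationary point is a minimum. The probabilities are thus equal within each range, in agreement with (111.6).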

In agreement with our previous discussion of (109.8), this now permits
us to describe the long time behaviour of the ensemble, representing
a system proper in contact with its surroundings, as a tendency
towards the ultimate assumption of probabilities for states of the
combined system, which would be approximately equal for all states
$k_1k_2$ lying inside of any specified narrow energy range $E$ to $E+\Delta E$ that
we wish to consider, and which would remain approximately constant
in time when once reached.

(b) Number of states of surroundings as a function of energy. Although
the foregoing gives us information as to the probabilities that
may be ultimately expected for states $k_1k_2$ of the combined system, it
does not yet provide the information that we desire as to the probabilities
for states $k_1$ of the system proper. In order to obtain such
information, it will first be necessary to introduce, by way of digression,
a discussion of the distribution of states of the surroundings as a function
of energy. This will then permit us to find the number of states
$k_1k_2$, which would lie in any selected energy range $E$ to $E+\Delta E$ with
a specified value of $k_1$ and with any possible value of $k_2$. From the
equal probabilities for all states $k_1k_2$ of the combined system within
any such range, we can then draw the desired conclusions as to the
relative probabilities for different states $k_1$ of the system proper.

The surroundings which we shall usually wish to consider will be
some ordinary macroscopic structure of many degrees of freedom (such
as a heat bath, the walls of a suitable container, or the environment
provided by a laboratory) which will be of sufficient extent and high
enough energy content, so that its energy will lie in a range of the
spectrum which can be treated as practically continuous. Under such
circumstances the number of unperturbed energy eigenstates for a
system increases very rapidly with energy in a manner which can be
appropriately described by an expression of the form

$$\frac{dG(E)}{dE} = C E^{n}, \qquad (111.7)$$

where $G(E)$ is the number of eigenstates of energy equal to or less than
$E$, $C$ is a constant, and the exponent $n$ proves to be a large number
of the order of the number of degrees of freedom of the system which
have been excited. In many simple cases the exponent $n$ is found to be
a constant. And in general $n$ may be taken as varying only gradually
with the energy $E$, since the main part of the rapid increase in number
of states with energy is found to be given by the proportionality of
$dG/dE$ to a power of $E$ which is of the order of the number of degrees
of freedom of the system.

We may illustrate these remarks as to the character of the above
expression for the distribution of unperturbed energy eigenstates by
giving detailed evaluations of the distribution in the case of two simple
quantum mechanical systems. The evaluations will be found to depend
on a known type of definite integral, of the kind studied by Dirichlet,
for which we can write the formula,†

$$\int\!\cdots\!\int \epsilon_1^{\,i_1-1}\epsilon_2^{\,i_2-1}\cdots\epsilon_N^{\,i_N-1}\,d\epsilon_1\,d\epsilon_2\cdots d\epsilon_N = \frac{\Gamma(i_1)\Gamma(i_2)\cdots\Gamma(i_N)}{\Gamma(i_1+i_2+\cdots+i_N+1)}\,E^{\,i_1+i_2+\cdots+i_N}, \qquad \epsilon_1+\epsilon_2+\cdots+\epsilon_N < E, \qquad (111.8)$$

where the integration is taken over all positive values of $\epsilon_1$, $\epsilon_2$, ..., $\epsilon_N$ for
which the sum is less than some specified value $E$, and $i_1$, $i_2$, ..., $i_N$ may
be any set of numbers greater than zero.

† This is a special case of equation (11) in Appendix II.
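A numerical check of the Dirichlet formula may be helpful. The Python sketch below evaluates the closed form, read here as $\Gamma(i_1)\cdots\Gamma(i_N)\,E^{i_1+\cdots+i_N}/\Gamma(i_1+\cdots+i_N+1)$, and compares it, for the case $N = 2$, with a crude midpoint-rule estimate of the integral over the triangle $\epsilon_1+\epsilon_2 < E$. The function names and the grid size are assumptions made for illustration, and the grid estimate is only accurate to a fraction of one per cent.

```python
import math


def dirichlet_closed_form(exponents, E):
    """Closed form of (111.8) for the given i_1..i_N and upper bound E."""
    s = sum(exponents)
    numerator = math.prod(math.gamma(i) for i in exponents)
    return numerator / math.gamma(s + 1.0) * E ** s


def dirichlet_grid_2d(i1, i2, E, n=400):
    """Midpoint-rule estimate of the N = 2 integral over e1 + e2 < E."""
    h = E / n
    total = 0.0
    for a in range(n):
        e1 = (a + 0.5) * h
        for b in range(n - a):
            e2 = (b + 0.5) * h
            if e1 + e2 < E:          # keep only cells whose midpoint is inside
                total += e1 ** (i1 - 1.0) * e2 ** (i2 - 1.0)
    return total * h * h


exact = dirichlet_closed_form([1.5, 1.5], 1.0)   # equals pi/24
approx = dirichlet_grid_2d(1.5, 1.5, 1.0)
```

With all exponents equal to unity the closed form reduces to $E^N/N!$, the volume of the region of integration, which is the case needed for the oscillators; the value $\tfrac{3}{2}$ used in the check is the case needed for the gas.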

As the first example for which we shall give a detailed evaluation
of the distribution of energy states, we choose a system of $N$ weakly
interacting harmonic oscillators of frequency $\nu$. In accordance with the
energy spectrum for such oscillators as given by (72.3), we can take

$$g(\Delta\epsilon) = \frac{1}{h\nu}\,\Delta\epsilon \qquad (111.9)$$

as an expression for the number of eigenstates for a single oscillator
that would lie in any energy range $\epsilon$ to $\epsilon+\Delta\epsilon$, provided that $\Delta\epsilon$ is large
compared with $h\nu$, a condition which restricts our final result (111.11)
to the essentially classical limit where the mean energy of the oscillators
is large compared with $h\nu$. For the number of unperturbed eigenstates
for the whole system of $N$ oscillators that would have energies less than
a specified value $E$ we may then write




$$G(E) = \left(\frac{1}{h\nu}\right)^{N} \int\!\cdots\!\int d\epsilon_1\,d\epsilon_2\cdots d\epsilon_N, \qquad (111.10)$$

where the integration is to be taken over all values for which the sum
of the energies of the individual oscillators is not greater than the total
energy of interest $E$. Evaluating the integral on the right-hand side
of this expression, with the help of (111.8), choosing unity as the value
of the quantities $i_1$, $i_2$, ..., $i_N$, we obtain

$$G(E) = \left(\frac{1}{h\nu}\right)^{N} \frac{E^{N}}{\Gamma(N+1)}. \qquad (111.11)$$

And differentiating with respect to $E$, we then have

$$\frac{dG(E)}{dE} = \left(\frac{1}{h\nu}\right)^{N} \frac{N}{\Gamma(N+1)}\,E^{N-1} \qquad (111.12)$$

as a specific illustration of our general expression (111.7), with the value
of the constant $C$ explicitly expressed, and with the exponent $n$ in this
case equal to one less than the number of degrees of freedom of the
system.
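The continuum result for the oscillators can be compared with a direct count of states. In the Python sketch below, energies are measured in units of $h\nu$ with the zero-point energy dropped, an assumption made here for simplicity, so that a state is a set of non-negative integers $(n_1,\dots,n_N)$ and the condition on the energy becomes $n_1+\cdots+n_N \le M$. The function names are hypothetical.

```python
import math


def exact_count(N, M):
    """Number of sets of non-negative integers (n_1..n_N) with sum <= M,
    i.e. states of N oscillators holding at most M quanta in total."""
    return math.comb(M + N, N)


def continuum_estimate(N, M):
    """G(E) = (E/h nu)**N / Gamma(N + 1) with E/(h nu) set equal to M."""
    return M ** N / math.gamma(N + 1)


def local_exponent(N, M):
    """Slope of ln(dG/dE) against ln E between M and 2M quanta, using the
    number of states with exactly m quanta as a stand-in for dG/dE."""
    def shell(m):
        return math.comb(m + N - 1, N - 1)
    return (math.log(shell(2 * M)) - math.log(shell(M))) / math.log(2.0)


ratio = exact_count(3, 1000) / continuum_estimate(3, 1000)
slope = local_exponent(3, 1000)
```

For three oscillators and a thousand quanta the exact count exceeds the continuum estimate by well under one per cent, and the slope lies close to $N - 1 = 2$, as the classical limit requires. When $M$ is not large compared with $N$ the agreement deteriorates, in keeping with the restriction stated in the text.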

As a second example of a specific system we may choose a dilute gas
composed of $N$ weakly interacting simple similar particles of mass $m$
in a container of volume $v$. Making use of our previous expression
(71.16),

$$g(\Delta\epsilon) = \frac{4\pi v m}{h^{3}}\sqrt{2m\epsilon}\;\Delta\epsilon, \qquad (111.13)$$

for the number of states for a single particle in the energy range $\epsilon$ to
$\epsilon+\Delta\epsilon$, we can then immediately write

$$G(E) = \frac{1}{N!}\left(\frac{4\pi v m\sqrt{2m}}{h^{3}}\right)^{N} \int\!\cdots\!\int \epsilon_1^{1/2}\epsilon_2^{1/2}\cdots\epsilon_N^{1/2}\,d\epsilon_1\,d\epsilon_2\cdots d\epsilon_N \qquad (111.14)$$

as an expression for the number of states of the N particles of the gas 
having a total energy equal to or less than $E$. The factor $1/N!$ in this
expression arises from the circumstance that we do not regard the
interchange of similar particles in a quantum mechanical system as
leading to a new state; the integration is over all values for which the
sum of the energies of the particles is not greater than the total energy
of interest $E$; and we assume this latter high enough so that we can
neglect the necessity for giving special consideration to portions of the
above integral which would correspond to more than a single particle
in the same elementary state. Evaluating the right-hand side of (111.14)
by another application of the formula of integration (111.8), this time
taking $\tfrac{3}{2}$ as the value of the quantities $i_1$, $i_2$, ..., $i_N$, we obtain

$$G(E) = \frac{1}{N!}\left(\frac{4\pi v m\sqrt{2m}}{h^{3}}\right)^{N} \frac{\{\Gamma(\tfrac{3}{2})\}^{N}}{\Gamma(\tfrac{3}{2}N+1)}\,E^{\frac{3}{2}N}. \qquad (111.15)$$

And, differentiating with respect to $E$, we now have

$$\frac{dG(E)}{dE} = \frac{3}{2}\,\frac{1}{(N-1)!}\left(\frac{4\pi v m\sqrt{2m}}{h^{3}}\right)^{N} \frac{\{\Gamma(\tfrac{3}{2})\}^{N}}{\Gamma(\tfrac{3}{2}N+1)}\,E^{\frac{3}{2}N-1} \qquad (111.16)$$

as a second illustration of our general expression (111.7), again with
the constant $C$ given explicit expression, and with the exponent $n$ in
this case equal to one less than half the number of degrees of freedom
of the system.
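The exponent found for the gas can be checked numerically. The Python sketch below works with logarithms, since $G(E)$ itself becomes unmanageably large for many particles. It takes $G(E)$ to be $(1/N!)\,a^N\{\Gamma(\tfrac{3}{2})\}^N E^{3N/2}/\Gamma(\tfrac{3}{2}N+1)$, which is the result of applying (111.8) with all exponents equal to $\tfrac{3}{2}$. The symbol `a` is an assumption of this sketch and stands for the single-particle prefactor $4\pi v m\sqrt{2m}/h^3$; the function names are likewise hypothetical.

```python
import math


def log_G(E, N, a=1.0):
    """ln G(E) for the dilute gas; `a` stands for 4*pi*v*m*sqrt(2m)/h**3."""
    return (-math.lgamma(N + 1) + N * math.log(a)
            + N * math.lgamma(1.5) - math.lgamma(1.5 * N + 1.0)
            + 1.5 * N * math.log(E))


def log_density(E, N, a=1.0):
    """ln(dG/dE), using dG/dE = (3N/2) G / E."""
    return math.log(1.5 * N) + log_G(E, N, a) - math.log(E)


def exponent(E, N, a=1.0, rel_step=1e-4):
    """Central-difference estimate of d ln(dG/dE) / d ln E."""
    up, down = E * (1.0 + rel_step), E * (1.0 - rel_step)
    return ((log_density(up, N, a) - log_density(down, N, a))
            / (math.log(up) - math.log(down)))


n_gas = exponent(2.0, 100)     # expected to be 3*100/2 - 1 = 149
```

For a single particle the expression reduces to $G = \tfrac{2}{3}aE^{3/2}$, which is the direct integral of the single-particle density of states and serves as a check on the prefactors.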

These two systems, which we have chosen for the detailed illustration
of our general expression (111.7) for the distribution of unperturbed
energy states, may be regarded as reasonably typical, although rather
specially simple, examples of the kind of surroundings that it will be
natural to consider. Moreover, from the character of the general
formula of integration (111.8) on which our evaluations of the above
distributions have depended, it will be appreciated that similar results
may be expected in general in the case of other systems composed of
weakly interacting elements. When the mean energy of the elements
is large compared with the spacing of their energy levels, and the
density of these levels per unit energy range may be taken to vary
with a constant power of the energy, it will be seen that our general
expression (111.7) for the distribution of the states of the whole system
could be expected to hold with the exponent $n$ strictly constant. And in
less special cases it will be seen that the exponent $n$ could be reasonably
supposed to vary only gradually with the total energy $E$. This, then,
confirms the original remarks which we have made in connexion with
our general expression (111.7) for the distribution of states.

In view of the foregoing discussion, we may now conclude this
digression by writing, in agreement with (111.7),

$$\Delta G_2 = C_2\,E_2^{\,n_2}\,\Delta E_2 \qquad (111.17)$$

as a reasonable expression for the number of states of the surroundings
$S_2$ which would lie in any small energy range $E_2$ to $E_2+\Delta E_2$, with $C_2$
a constant and $n_2$ a number which does not vary rapidly with the
energy $E_2$ and which is of the order of the number of degrees of freedom
of the surroundings. These quite moderate requirements as to the
character of the surroundings $S_2$ will be all that is needed for our
investigation of the ultimate probabilities for different states $k_1$ of
the system proper $S_1$.

(c) Probabilities for states of the system proper. In accordance with
(111.6), we have found a tendency for the combined ensemble to assume
an ultimate condition such that there would be equal probabilities
$P_{k_1k_2}$ for all states $k_1k_2$ in each range $E$ to $E+\Delta E$ for the total energy of
the system proper plus its surroundings. Hence, if we fix our attention,
for the time being, on that portion of the representative ensemble for
the combined system having members which lie in one particular such
energy range $E$ to $E+\Delta E$, we can set the probability $P_{k_1}$ for any
particular state $k_1$ of the system proper proportional to the number of
states $k_2$ of the surroundings that would have energies $E^0_{k_2}$ lying in
the range

$$E - E^0_{k_1} < E^0_{k_2} < E - E^0_{k_1} + \Delta E, \qquad (111.18)$$

where $E^0_{k_1}$ is the energy of the system proper in the particular state $k_1$
under consideration. With the help of the explicit expression given by
(111.17) for the number of states of the surroundings lying in such an
energy range, we can then write

$$\begin{aligned} P_{k_1} &= \text{const.}\,C_2\,(E^0_{k_2})^{n_2}\,\Delta E \\ &= \text{const.}\,C_2\,(E - E^0_{k_1})^{n_2}\,\Delta E \\ &= \text{const.}\,(E - E^0_{k_1})^{n_2} \end{aligned} \qquad (111.19)$$

as an expression giving the probabilities of finding different states $k_1$
for the system proper within that portion of the ensemble which we
are now considering. Furthermore, since $E$ will be substantially equal
to the sum of the mean energies $\bar{E}_1 + \bar{E}_2$ of the system and its surroundings
in the narrow range of total energies $\Delta E$, we can rewrite this expression
in the form

$$\begin{aligned} P_{k_1} &= \text{const.}\,(\bar{E}_1 + \bar{E}_2 - E^0_{k_1})^{n_2} \\ &= \text{const.}\,(\bar{E}_2)^{n_2}\left(1 + \frac{\bar{E}_1 - E^0_{k_1}}{\bar{E}_2}\right)^{n_2}. \end{aligned} \qquad (111.20)$$
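The counting argument behind (111.19) and (111.20) can be followed exactly in a small model. In the Python sketch below, the system proper is a single harmonic oscillator and the surroundings are `N2` oscillators of the same frequency, with energies in units of $h\nu$ and with the total number of quanta `M` sharply fixed in place of a range $\Delta E$. All of these are assumptions made for illustration, as are the function and variable names. Each state of the combined system is given equal weight, so that the probability of finding $n$ quanta in the system proper is proportional to the number of ways in which the surroundings can hold the remaining $M - n$.

```python
import math


def system_probabilities(M, N2, n_max):
    """P(n) for n = 0..n_max quanta in the system proper (one oscillator).

    Every state of the combined system with M quanta in total has equal
    weight, so P(n) is proportional to the number of ways the N2 bath
    oscillators can share the remaining M - n quanta.
    """
    total = math.comb(M + N2, N2)       # all combined states with M quanta
    return [math.comb(M - n + N2 - 1, N2 - 1) / total
            for n in range(n_max + 1)]


M, N2 = 20000, 200                      # illustrative sizes, with M >> N2
P = system_probabilities(M, N2, 5)
ratios = [P[n + 1] / P[n] for n in range(5)]

# With n_2 = N2 - 1 for the bath and a mean bath energy close to M,
# the factor (1 + x / E_2)**n_2 in (111.20) suggests a constant ratio
# close to exp(-n_2 / E_2) between successive states.
predicted = math.exp(-(N2 - 1) / M)
```

The successive ratios are nearly constant and lie close to the predicted value, so that the probabilities fall off geometrically with the energy of the system proper. The choice of `M` large compared with `N2` corresponds to the classical limit in which the power law for the number of states of the oscillators was obtained.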

We shall now be interested in showing that this result can be
re-expressed in a very simple form for states of the system proper having
energies $E^0_{k_1}$ near enough to the mean energy $\bar{E}_1$ for the system proper
to satisfy

$$\frac{|\bar{E}_1 - E^0_{k_1}|}{\bar{E}_2} \ll 1. \qquad (111.21)$$

This restriction will have two effects. In the first place, since we can
neglect higher powers of the above ratio, it permits us to change into
exponential form by making the substitution

$$1 + \frac{\bar{E}_1 - E^0_{k_1}}{\bar{E}_2} = e^{(\bar{E}_1 - E^0_{k_1})/\bar{E}_2}. \qquad (111.22)$$

In the second place, since we shall evidently have the approximate
equality $\bar{E}_1 - E^0_{k_1} = E_2 - \bar{E}_2$, where $E_2$ is the energy for a state of the
surroundings compatible with the particular state of the system $k_1$
under consideration and $\bar{E}_1$ and $\bar{E}_2$ are mean energies in the narrow
range of total energy $\Delta E$, it will make it feasible to expand $n_2$ around
the value which it has at $\bar{E}_2$ and write

$$n_2 = \bar{n}_2 + \frac{d\bar{n}_2}{dE_2}\,(\bar{E}_1 - E^0_{k_1}), \qquad (111.23)$$

where we use $\bar{n}_2$ and $d\bar{n}_2/dE_2$ as convenient symbols for the values of
$n_2$ and $dn_2/dE_2$ at $E_2 = \bar{E}_2$, and where we drop higher terms in accordance
with the assumed smallness of $|\bar{E}_1 - E^0_{k_1}|$ compared with $\bar{E}_2$, and
in accordance with our previous finding that $n_2$ would at most be a
slowly varying function of $E_2$.

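Substituting (111.22) and (111.23) in (111.20), retaining terms only to
the first order in $(\bar{E}_1 - E^0_{k_1})$, it will be seen that we can then write

$$P_{k_1} = \text{const.}\,(\bar{E}_2)^{\bar{n}_2} \exp\!\left[\left(\frac{\bar{n}_2}{\bar{E}_2} + \frac{d\bar{n}_2}{dE_2}\log\bar{E}_2\right)(\bar{E}_1 - E^0_{k_1})\right]. \qquad (111.24)$$

Or defining two constants $C$ and $\beta$ by

$$C = \text{const.}\,(\bar{E}_2)^{\bar{n}_2}\,e^{\beta\bar{E}_1} \qquad (111.25)$$

and

$$\beta = \frac{\bar{n}_2}{\bar{E}_2} + \frac{d\bar{n}_2}{dE_2}\log\bar{E}_2, \qquad (111.26)$$

we now obtain

$$P_{k_1} = C\,e^{-\beta E^0_{k_1}} \qquad (111.27)$$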

as the desired simple expression for the ultimate probabilities of states
of the system proper, where it will be remembered that we are confining
our attention to states of the system proper having energies
close enough to the mean to satisfy (111.21), and are confining our
attention for the time being only to a particular selected energy range
$E$ to $E+\Delta E$ for the combined system proper and surroundings.
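The quality of the exponential approximation can be examined numerically. The Python sketch below treats $n_2$ as strictly constant, an assumption which removes the term in $dn_2/dE_2$, so that the power $(1+x/\bar{E}_2)^{n_2}$ with $x = \bar{E}_1 - E^0_{k_1}$ is to be compared with $e^{\beta x}$, where $\beta = n_2/\bar{E}_2$. The function names and the numerical values are hypothetical.

```python
import math


def exact_log_weight(Ek, E1_mean, E2_mean, n2):
    """ln of (1 + (E1_mean - Ek)/E2_mean)**n2, the Ek-dependent factor."""
    return n2 * math.log1p((E1_mean - Ek) / E2_mean)


def boltzmann_log_weight(Ek, E1_mean, E2_mean, n2):
    """ln of exp(beta * (E1_mean - Ek)) with beta = n2/E2_mean (n2 constant)."""
    beta = n2 / E2_mean
    return beta * (E1_mean - Ek)


# Large surroundings: the two logarithms agree closely over the whole range.
large_bath = [abs(exact_log_weight(E, 5.0, 1e6, 1e6)
                  - boltzmann_log_weight(E, 5.0, 1e6, 1e6))
              for E in range(0, 21)]

# Small surroundings: the restriction on the ratio is violated and they differ.
small_bath = abs(exact_log_weight(20.0, 5.0, 50.0, 50.0)
                 - boltzmann_log_weight(20.0, 5.0, 50.0, 50.0))
```

The discrepancy in the logarithm is of the order of $n_2 x^2/2\bar{E}_2^{\,2}$. It is therefore negligible when the surroundings are large, and becomes appreciable only when the energy of the system proper departs from its mean by an amount comparable with the mean energy of the surroundings.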

With regard to the circumstance that the simple distribution law
(111.27) only applies to states $k_1$ of the system proper having energies
satisfying (111.21), it will be appreciated that the law would, nevertheless,
be substantially valid for all states of importance whenever the
energy content and number of degrees of freedom for the surroundings
are large compared with those quantities for the system proper. To
examine this in detail we may return to (111.20), and, by introducing
the appropriate factor for the number of states of the system proper
that would have energies in a range $dE_1$, may write

$$\begin{aligned} P(E_1)\,dE_1 &= \text{const.}\,(E_1)^{n_1}(\bar{E}_2)^{n_2}\left(1 + \frac{\bar{E}_1 - E_1}{\bar{E}_2}\right)^{n_2} dE_1 \\ &= \text{const.}\,(\bar{E}_1 + \bar{E}_2 - E_2)^{n_1}(\bar{E}_2)^{n_2}\left(1 + \frac{\bar{E}_1 - E_1}{\bar{E}_2}\right)^{n_2} dE_1 \\ &= \text{const.}\,(\bar{E}_1)^{n_1}(\bar{E}_2)^{n_2}\left(1 + \frac{\bar{E}_2 - E_2}{\bar{E}_1}\right)^{n_1}\left(1 + \frac{\bar{E}_1 - E_1}{\bar{E}_2}\right)^{n_2} dE_1 \end{aligned} \qquad (111.28)$$

as an expression for the probability of finding the system proper in
different energy ranges $dE_1$. We then indeed see, with $\bar{E}_2 \gg \bar{E}_1$ and
$n_2 \gg n_1$, that the whole range of large probabilities could be covered
without contradicting the restriction (111.21),
on which our previous considerations were based, since with $(\bar{E}_1 - E_1)$
positive the large factor $1/\bar{E}_1$ in the next to the last parenthesis in
(111.28) would be operative in leading to low probabilities, and with
$(\bar{E}_1 - E_1)$ negative the large exponent $n_2$ for the last parenthesis would
so operate. Hence we can regard our simple formula (111.27) as substantially
valid for all states of importance provided the surroundings
are sufficiently large compared with the system proper.
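The concentration of probability described here can be seen in a simple numerical case. The Python sketch below takes the dependence on $E_1$ in (111.28) to be $E_1^{\,n_1}(E - E_1)^{n_2}$, with $E = \bar{E}_1 + \bar{E}_2$ the total energy, which is how the first line of that expression is read here. It locates the most probable energy of the system proper by a plain grid search. The function names and the values of $n_1$, $n_2$ and $E$ are assumptions chosen for illustration.

```python
import math


def log_weight(E1, E_total, n1, n2):
    """ln of E1**n1 * (E_total - E1)**n2, up to an additive constant."""
    return n1 * math.log(E1) + n2 * math.log(E_total - E1)


def most_probable_E1(E_total, n1, n2, steps=100000):
    """Grid search for the maximum of log_weight on 0 < E1 < E_total."""
    best, best_val = None, -math.inf
    for j in range(1, steps):
        E1 = E_total * j / steps
        val = log_weight(E1, E_total, n1, n2)
        if val > best_val:
            best, best_val = E1, val
    return best


E_total, n1, n2 = 1000.0, 10, 990       # surroundings much larger than system
peak = most_probable_E1(E_total, n1, n2)

# Logarithm of the probability at three times the peak energy, relative to
# its value at the peak.
drop = log_weight(3.0 * peak, E_total, n1, n2) - log_weight(peak, E_total, n1, n2)
```

The maximum lies at $E_1 = n_1 E/(n_1+n_2)$, here one per cent of the total energy, and at three times that energy the probability has already fallen by a factor of roughly $e^{-9}$. The whole range of appreciable probability is thus confined to energies very small compared with the energy of the surroundings.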

With regard to the circumstance that the foregoing considerations
have all been concerned with members of the combined ensemble lying
in one particular energy range, a separate expression of the form
(111.27) would have to be applied to each energy range $E$ to $E+\Delta E$
for the combination of system proper and surroundings, the constant $C$
being determined by the probability of finding a member of the combined
ensemble in that range, and the constant $\beta$ being determined in
accordance with (111.26) by the mean energy $\bar{E}_2$ of the surroundings
for members of the combined ensemble in the range mentioned. It will
be noted, nevertheless, that the constant $\beta$ would have substantially
the same value for neighbouring ranges $E$ to $E+\Delta E$, since the mean
energies for the surroundings would be practically the same in such
neighbouring ranges for the total energy. It will be appropriate, moreover,
to concern ourselves with situations where our initial knowledge
of condition is such as to give a high concentration of the probability
for different values of the total energy in the neighbourhood of some
particular range $E$ to $E+\Delta E$. Thus, with an appropriate value for the
constant $C$, it will also be possible to take an expression of the form
(111.27) as applying to the whole ensemble.

Hence, by way of summary, we may now describe the long time
behaviour of a system of interest $S_1$ in contact with its surroundings $S_2$
as characterized by a tendency for the corresponding statistical
representation to proceed to an ultimate nearly steady condition, such that
the distribution of coarse-grained probabilities for the different unperturbed
energy states $k_1$ of the system proper would be given by the
simple formula

$$P_{k_1} = C\,e^{-\beta E^0_{k_1}}, \qquad (111.29)$$

where $C$ and $\beta$ would be constants, and the formula would be valid
over a range of energies $E^0_{k_1}$ in the neighbourhood of the mean energy $\bar{E}_1$
for the system proper which would be related to the mean energy $\bar{E}_2$
of the surroundings by

$$\frac{|\bar{E}_1 - E^0_{k_1}|}{\bar{E}_2} \ll 1, \qquad (111.30)$$

this range being wide enough, in the case of surroundings very much
larger than the system, to include substantially all energies of importance.

(d) The concepts of thermal equalization, essential isolation, and 
essentially adiabatic process. In closing this discussion of the behaviour 
that can be expected for a system which interacts with its surroundings, 
it will prove convenient to distinguish three idealized types of such 
interaction which provide useful abstractions. These may be denoted 
by the terms thermal equalization, essential isolation, and essentially 
adiabatic process, and, assuming the allowance of sufficient time, will all 
three be taken as leading to or involving a distribution of probabilities 
for states $k_1$ of the system proper which can be substantially described
by the simple formula (111.29).

In a situation which we idealize as a case of thermal equalization we
shall take the system proper as purposely placed in good thermal contact
with a very large heat reservoir, having a large capacity for
furnishing or taking up energy, and a mean energy and number of
degrees of freedom which could be taken as going to infinity. Under
such limiting circumstances it will be seen from the foregoing discussion
that we could regard our formula (111.29), for the ultimate probabilities
of different states of the system proper, as approaching validity over
the entire range of energies $E^0_{k_1}$.

In a situation which we idealize as a case of essential isolation we
shall take the system proper as not purposely placed in thermal contact
with its surroundings, and on the average as having no net interchange
of energy with the outside in a series of similar experiments, but nevertheless
as necessarily in contact with immediate surroundings, such as
the walls of a container or portions of the laboratory environment.
Such a situation could be represented by a combined ensemble for the
system and its surroundings, with the possibility for any representative
of the system proper to adjust its energy by interchange with the
associated representative for the surroundings, but with the special
proviso that the surroundings are so chosen as to secure constant mean
energy for the representatives of the system proper. Since this special
feature of the surroundings would not disturb our previous considerations
as to the long time behaviour of the combined ensemble, the
ultimate probabilities for different states $k_1$ of the system proper would
then be described by equation (111.29) over a range of energies
around the mean $\bar{E}_1$ having a width which would depend on the extent
of the surroundings that interact with the system. In the situations
to which we apply the idealization of essential isolation we regard the
extent of the surroundings as sufficient to justify a use of the simple
exponential law (111.29) as providing a sufficiently valid complete
description of the ultimate probability distribution.

Finally, in a situation which we idealize as an essentially adiabatic 
process, we shall take the system proper as having the limited kind of
contact with its immediate surroundings which was described above 
but as subjected to changes in the value of some macroscopic external 
parameter, such as volume, which describes its condition. An essen- 
tially adiabatic process would thus be one in which the mean energy 
ascribed to the system proper would not be changed by ‘thermal flow’ 
from the surroundings, but might be changed by ‘mechanical action’ 
from the outside. Furthermore, in situations to which we apply the 
idealization of essentially adiabatic process, we regard the extent of 
the surroundings that interact with the system proper as sufficient, so 
that by carrying out the process extremely slowly it would be justifiable 
at each state of the change to use the simple exponential law (111.29) 
as though it gave a completely valid description of the probability 
distribution. 

The three abstract idealizations of thermal equalization, essential 
isolation, and essentially adiabatic process which we have thus intro- 
duced are seen to be quite different in character from the previous 
concept of perfect isolation as treated in §§ 109 and 110. Several 
remarks may be made with regard to the actual situations in which it 
would be appropriate to apply the new concepts. 

It is at once evident that the idealization which we have called 
thermal equalization would provide suitable statistical representation 
for the processes which ensue when a system is immersed in a large heat 
bath, and further discussion of this case will not be needed. 

With regard to the applicability of the concept of essential isolation 
somewhat more discussion is necessary. It will be appreciated on re- 
flection, however, that this idealization would often provide suitable 
statistical representation for practical situations in which there is no 
intention of permitting any flow of energy to or from a system of 
interest, and in which the system would ordinarily be spoken of as 
isolated, but in which it would not be appropriate to neglect the pos- 
sibility for fluctuations to take place in the energy of the system proper, 
through interaction with actual surroundings, to an extent sufficient to 
make the simple exponential law (111.29) a natural one to take as 
giving an approximately valid description of the ultimate probabilities 
for different states. 



Some of the factors which tend to make this procedure quite frequently
appropriate may be mentioned here. When the system proper
is actually a very small one, the extent of the surroundings that must
be considered as effectively concerned will tend to stand in a large
ratio to the extent of the system proper, thus giving the conditions
necessary for a wide range of validity for (111.29). When the system
proper is actually a very large one of many degrees of freedom, it turns
out (as we shall appreciate more fully in what follows) that the
statistical results obtained tend to be relatively independent of the
exact form of law chosen for the distribution of ultimate probabilities
provided it gives, as does (111.29), equal probabilities for states of
equal energy and a high concentration in the neighbourhood of a particular
energy. Furthermore, whatever the actual size of the system
proper, it will be appreciated with any given degree of separation
between system proper and surroundings, that the extent of the environment
which would have to be regarded as effectively interacting
with the system proper would in general increase as we go to longer and
longer times in considering our ultimate probability distributions.

It will also be appreciated, in the case of a system having some 
contact with its surroundings, that the concept of perfect isolation
would often be a clearly inexact idealization to employ, that the best
possible idealization would depend on the actual degree of isolation 
achieved together with the length of time allowed for interaction with 
the environment in too complicated a manner for ready determination 
or use, and that the concept of essential isolation would in any case 
provide a distribution law for ultimate probabilities which would have 
some range of substantial validity in the neighbourhood of the mean 
energy. This often makes the concept of essential isolation really the 
most reasonable one to use in developing general ideas as to the long 
time behaviour of the so-called isolated systems commonly encountered 
in nature or employed in laboratory or engineering technique. 

With regard to the applicability of the concept of essentially adiabatic
process as an idealization for actual processes, it will be appreciated
that the necessary points have been covered in the foregoing paragraphs
on the applicability of the concept of essential isolation, since we merely
have to add the idea of making changes in the external parameters for
systems which are otherwise left in essential isolation. We shall find
the idea of essentially adiabatic processes specially important when we
come to the statistical explanation of the principles of thermodynamics
in the next chapter. 




In conclusion it may be remarked that the concepts of perfect isola- 
tion and thermal equalization have often been employed in carrying out 
statistical mechanical considerations, but that the concepts of essential 
isolation and essentially adiabatic process are newly introduced here.†
This has been done not only in the interest of obtaining closer and less
abstract idealizations for certain actual situations, but also, as we shall
see in § 124, in order to overcome difficulties in the older treatment of 
reversible adiabatic processes. 

112. The canonical ensemble as representing equilibrium for a system in a heat bath or in essential isolation

(a) The canonical distribution. In accordance with the foregoing dis- 
cussion of systems in contact with sufficiently extensive surroundings, 
we have found a tendency for the coarse-grained probabilities $P_k$ for
different unperturbed energy eigenstates $k$ of the system proper to
assume an ultimate, quasi-permanent distribution which would be
described by

$$P_k = C e^{-\beta E_k}, \tag{112.1}$$

where $C$ and $\beta$ are constants, and $E_k$ is the unperturbed energy for the
state k. Such a distribution may be expected both in cases of thermal 
contact, where surroundings in the nature of a very large heat reservoir 
are purposely introduced, and also in cases of essential isolation. 

Taking cognizance of this finding, it will now be appropriate to choose
as a suitable general representation for the condition of equilibrium,
in such cases both of thermal contact and of essential isolation, the
distribution precisely defined in the true energy language by the density

$$\rho_{nm} = C e^{-\beta E_n}\, \delta_{nm}, \tag{112.2}$$

where $C$ and $\beta$ are parameters having values independent of the state
$n$, and the quantities $E_n$ are the true energies for the various energy
eigenstates $n$. Such a distribution, together with an appropriate
distribution to represent the surroundings giving a combined density matrix
diagonal in the energy language, would, as we know, remain strictly
constant in time and hence give as expected a description of equilibrium
that would be strictly independent of time. Furthermore, from the

† The importance of the effect of the surroundings of a system on its equilibrium
behaviour, even in those cases where we should ordinarily speak of the system as
isolated, has also been especially emphasized from a somewhat different point of view
by Epstein in his 'Critical Appreciation of Gibbs' Statistical Mechanics' published in
Commentary on the Scientific Writings of J. Willard Gibbs, vol. ii, Yale University Press,
1936.





close connexion between nearly steady states $k$ and true energy states
$n$, the distribution defined by (112.2) would give coarse-grained
probabilities for different states $k$ in substantial agreement with (112.1).
Hence, in such cases of thermal contact and of essential isolation, the
distribution defined by (112.2) would meet all the appropriate
requirements for representing the condition of equilibrium, including those
which arise from treating this condition as that which would be ulti- 
mately expected in the course of time after an initial approximate 
measurement of the nearly steady states of the system of interest and 
of its surroundings. 

The distribution defined by (112.2) gives an ensemble which is the
natural quantum mechanical analogue of the canonical ensemble of
Gibbs, and is commonly denoted by this same name. By introducing
new quantities defined by

$$\beta = \frac{1}{\theta} \quad\text{and}\quad C = e^{\psi/\theta}, \tag{112.3}$$

the distribution can also be expressed in the familiar form

$$\rho_{nm} = e^{(\psi - E_n)/\theta}\, \delta_{nm}, \tag{112.4}$$

which has already been mentioned in § 83(c), and which will prove
specially convenient in the next chapter in treating the relations
between statistical mechanics and thermodynamics. The relation of the
constants $\beta$ and $\theta$ to the equilibrium temperature $T$ of the system will
be discussed later, from an immediate phenomenological point of view
in § 114 (c) of the present chapter and from a thermodynamic point of
view in the next chapter.

In accordance with (112.2), the equilibrium probability for finding
any given state $n$ of energy $E_n$ will be given by

$$P_n = \rho_{nn} = C e^{-\beta E_n}, \tag{112.5}$$

where, as mentioned in connexion with (110.3), it will now be
appropriate to drop the distinction between coarse-grained probabilities
$P_n$ and fine-grained probabilities $\rho_{nn}$. Summing over all states $n$, we can
then write

$$\sum_n C e^{-\beta E_n} = 1 \tag{112.6}$$

as a valid expression of the normalization of the ensemble, and

$$\sum_n C e^{-\beta E_n} E_n = \bar{E} \tag{112.7}$$

as a necessary expression for the mean energy $\bar{E}$ of the members of the
ensemble. This thus gives two equations for determining the
parameters $C$ and $\beta$. With an appropriate selection for these quantities




we see that the ensemble would represent equilibrium for an essentially 
isolated system with any desired specification for its mean energy.
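As an illustration of how the normalization condition (112.6) and the mean-energy condition (112.7) fix the two parameters, the following minimal sketch solves them numerically. The finite list of energy levels, the function name `canonical_parameters`, and the bisection bracket are assumptions made for the example only and do not come from the text.

```python
import math

def canonical_parameters(levels, mean_energy, beta_lo=-50.0, beta_hi=50.0):
    """Solve (112.6) and (112.7) for C and beta by bisection on beta.

    `levels` is a hypothetical finite list of energies E_n; the bracket
    [beta_lo, beta_hi] is an assumed search range, not part of the text.
    """
    def mean_for(beta):
        # Shift the exponents so that the largest weight is exactly 1.
        shift = min(beta * e for e in levels)
        weights = [math.exp(-(beta * e - shift)) for e in levels]
        return sum(w * e for w, e in zip(weights, levels)) / sum(weights)

    lo, hi = beta_lo, beta_hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        # The mean energy decreases monotonically as beta increases.
        if mean_for(mid) > mean_energy:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    c = 1.0 / sum(math.exp(-beta * e) for e in levels)  # from (112.6)
    return c, beta

levels = [0.0, 1.0, 2.0, 3.0]          # illustrative energies only
c, beta = canonical_parameters(levels, mean_energy=1.0)
probabilities = [c * math.exp(-beta * e) for e in levels]
```

Since the requested mean energy lies below the unweighted average of the levels, the value of `beta` returned in this example is positive.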

(b) Justification for the canonical ensemble. The foregoing method
of introducing the canonical ensemble, as giving a good statistical
description of the ultimate equilibrium condition approached by
systems having sufficient interaction with their surroundings, is somewhat
different from the methods that have usually been employed in
introducing that ensemble as appropriately descriptive of equilibrium. It
will hence be of interest to make a few remarks as to various arguments 
that have been and may be made in connexion with the introduction 
of the canonical ensemble. 

A fairly common point of view has been that the
condition of equilibrium for ordinary systems would actually be best 
represented by a microcanonical ensemble, with all its members having 
energies practically identical with some precise value ascribed as that 
of the system of interest. A justification for substituting a canonical 
ensemble of appropriate mean energy, in place of the microcanonical 
ensemble, has then been based firstly on the circumstance that a high 
concentration around the mean energy would make the average pro- 
perties of the canonical ensemble practically identical with those of a 
microcanonical ensemble in a case of many degrees of freedom, and 
secondly on the practical ground that the canonical distribution is 
mathematically the easier of the two to handle.† From our present
point of view, such a justification for the canonical ensemble would 
seem quite roundabout, since the original selection of the micro- 
canonical ensemble as fundamental would appear based on ideas as to 
perfect isolation and as to the precise determination of energy which 
would not usually be appropriate. It is perhaps to be emphasized, 
however, in the ordinary cases of interest with systems of many degrees 
of freedom, that the two ensembles do lead to substantially the same 
conclusions as to the condition of equilibrium. 
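The concentration of the canonical ensemble around its mean energy for many degrees of freedom can be illustrated with a deliberately simple model. The sketch below assumes a system of independent two-level elements with a common energy gap; this model and the function name `relative_energy_spread` are assumptions for illustration, not a case treated in the text.

```python
import math

def relative_energy_spread(n_elements, beta, gap=1.0):
    """Root-mean-square energy deviation divided by the mean energy
    for n_elements independent two-level elements (assumed model)."""
    # Probability that a single element is in its upper level.
    p_upper = math.exp(-beta * gap) / (1.0 + math.exp(-beta * gap))
    mean = n_elements * gap * p_upper
    variance = n_elements * gap ** 2 * p_upper * (1.0 - p_upper)
    return math.sqrt(variance) / mean

# The relative spread falls off as the inverse square root of the size.
spread_small = relative_energy_spread(100, beta=1.0)
spread_large = relative_energy_spread(10000, beta=1.0)
```

In this model a hundredfold increase in the number of elements reduces the relative spread tenfold, which is the sense in which the two ensembles come to agree.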

A formal method of introducing a canonical distribution for the energy
states $n$ of a system of interest can be obtained‡ by requiring a minimum of
the quantity $S$ defined in terms of energy states for the system itself by

$$S = \sum_n P_n \log P_n, \tag{112.8}$$

† See, for example, Lorentz, Abhandlungen über theoretische Physik, Teubner, 1907,
in particular § 79, p. 289. Compare also the remark of Gibbs, Elementary Principles of
Statistical Mechanics, Yale University Press, 1902, p. 116.

‡ See, for example, Pauli, Handbuch der Physik, xxiv/1, second edition, Berlin, 1933,
p. 161.





under the subsidiary conditions of a constant total probability for one
or another state

$$\sum_n P_n = 1 \tag{112.9}$$

and of a constant value for the mean energy $\bar{E}$

$$\sum_n P_n E_n = \bar{E}. \tag{112.10}$$

Making use of the usual methods of determining a conditional
minimum, we should then indeed be led to the distribution

$$P_n = C e^{-\beta E_n}, \tag{112.11}$$

with $C$ and $\beta$ constants, in agreement with that for a canonical
ensemble. Furthermore, with the help of the concept of essentially
isolated system, as discussed in § 111 (d), it will be seen that we can
regard this method of attaining a canonical distribution as actually
descriptive of the ultimate results which would be expected on the
average in the case of a system left to itself in contact with sufficiently
extensive surroundings which could adjust the energy of the system to
different values without affecting the mean value that would be found
in successive trials.
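The conditional minimum referred to above can be sketched with undetermined multipliers. The multiplier $\lambda$ is notation assumed for this illustration; the multiplier attached to the energy condition is written $\beta$ so that the result can be compared directly with the canonical form.

```latex
\begin{aligned}
&\delta\Big[\sum_n P_n \log P_n + \lambda \sum_n P_n + \beta \sum_n P_n E_n\Big] = 0 \\
&\Longrightarrow\quad \sum_n \big(\log P_n + 1 + \lambda + \beta E_n\big)\,\delta P_n = 0 \\
&\Longrightarrow\quad \log P_n = -(1+\lambda) - \beta E_n \\
&\Longrightarrow\quad P_n = C e^{-\beta E_n}, \qquad C = e^{-(1+\lambda)} .
\end{aligned}
```

The second variation of the bracketed expression is $\sum_n (\delta P_n)^2 / P_n$, which is positive, so that the stationary value is indeed a minimum. The two constants are then fixed by the two subsidiary conditions.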

A different kind of justification for the choice of the canonical
ensemble as representing equilibrium was given by the discovery of
Gibbs† that such a selection makes it possible to obtain very simple
statistical mechanical analogues for the fundamental quantities and
equations of thermodynamics. These analogues appeared to Gibbs
himself more satisfactory than those which he obtained from the choice of
the microcanonical distribution‡ as a representation of equilibrium. As
we shall see in the next chapter, it can now be made clearly evident
that the canonical distribution must be regarded as the one which
provides the appropriate mechanical analogues for the quantities and 
equations of thermodynamics. 

Somewhat deeper justification for the choice of the canonical
ensemble as representing a system in thermal equilibrium was given by
three further discoveries by Gibbs. The first‖ of these is the
immediately evident result, see the form of (112.4), that a canonical
distribution for a system consisting of two parts implies a canonical distribution
for each of those parts with the same value of the parameter $\theta$, which,
as we shall see later, can be related to temperature. This can be inter- 

† Gibbs, Elementary Principles of Statistical Mechanics, Yale University Press, 1902,
p. 44.

‡ Gibbs, loc. cit., chap. xiv.

‖ Gibbs, loc. cit., p. 36.




preted as corresponding to the circumstance that a system in thermal 
equilibrium can be regarded as composed of parts which are themselves 
in thermal equilibrium with the same temperature as the whole. The 
second† of these discoveries of Gibbs was that a microcanonical
distribution for a combined system consisting of a small system proper
immersed in a large heat bath would imply a canonical distribution for
the states of the system proper. This result agrees with our somewhat
more general finding as to the ultimate condition of the representative
ensemble for a system in contact with extensive surroundings, and gives
a similar justification for the canonical distribution as representing
equilibrium. The third‡ of the above-mentioned discoveries of Gibbs
was the finding that an arbitrary representative ensemble would assume
canonical form with the expected value of $\theta$, as a consequence of
repeated interaction of its members with those of a canonical ensemble
representing a large heat bath. We shall give a special demonstration
of this principle in the next chapter, see § 128(c); the result may be
interpreted as giving a satisfactory method of representing the process
of temperature adjustment. In addition to these discoveries, it should
also be mentioned, as emphasized by Gibbs,‖ that the microcanonical
ensemble could hardly be regarded as giving a satisfactory representa- 
tion for a system in equilibrium with a heat bath, since it would give 
no recognition to the fluctuations in energy which would actually be 
possible. 
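The second of these results can be illustrated by direct counting in a toy model. The sketch below assumes a single quantized oscillator sharing a fixed number of energy quanta with a bath of identical oscillators, every joint state being equally probable; the model and the names `bath_multiplicity` and `system_probability` are assumptions for the example, not Gibbs's own (classical) argument.

```python
from math import comb

def bath_multiplicity(quanta, oscillators):
    """Number of ways of distributing `quanta` among `oscillators`."""
    return comb(quanta + oscillators - 1, oscillators - 1)

def system_probability(k, total_quanta, bath_oscillators):
    """Probability that the oscillator of interest holds k quanta when
    all joint states of oscillator plus bath are equally probable."""
    favourable = bath_multiplicity(total_quanta - k, bath_oscillators)
    # All states of the combined system of bath_oscillators + 1 oscillators.
    total = bath_multiplicity(total_quanta, bath_oscillators + 1)
    return favourable / total

M, N = 2000, 1000   # illustrative sizes: quanta shared, bath oscillators
p = [system_probability(k, M, N) for k in range(6)]
ratios = [p[k + 1] / p[k] for k in range(5)]
canonical_ratio = M / (M + N)   # constant ratio of a pure exponential law
```

For a bath this large, successive probabilities fall off by a nearly constant factor, which is the exponential dependence on energy characteristic of the canonical distribution.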

It will be appreciated that our own method of introducing the
canonical ensemble, as representing the ultimate condition of equilibrium to
be expected for a system in contact with a heat reservoir or in essential
but not perfect isolation from its surroundings, stands in no kind of
conflict with the justifications for the use of that ensemble that were 
provided by the work of Gibbs. It may be emphasized, however, that 
our considerations give a more complete account of the processes by 
which equilibrium is achieved, and give due recognition to the possible 
effect of the walls and other surroundings in adjusting the energy of 
a system proper to different values. In the work of Gibbs the abstract 
idealization was definitely introduced that the walls of the container 
should have no effect.†† We shall see, nevertheless, in the next chapter
in § 124 that the effect of the walls and other immediate surroundings 
must be definitely introduced in order to supply a deficiency in the 
Gibbs treatment of adiabatic processes. 

† Gibbs, loc. cit., p. 183.

‡ Gibbs, loc. cit., p. 161.

‖ Gibbs, loc. cit., p. 180.

†† Gibbs, loc. cit., p. 164.




In conclusion it may be well to emphasize that physical-chemical 
systems of interest, under usual conditions of laboratory or engineering 
practice, must be regarded as making sufficient contact with their sur-
roundings so that canonical distributions will be appropriate in a funda-
mental treatment of their conditions of equilibrium. In this connexion
it is significant again to point out, in the case of a small system of 
interest having a small number of degrees of freedom, that the sur- 
roundings with which it must interact to secure substantial validity of 
the canonical distribution will themselves only have to be of moderate 
extent, and, in the case of a large system of interest of many degrees 
of freedom, that the differences between the microcanonical and canoni- 
cal distribution as a description of equilibrium tend to become unim- 
portant. It should also be noted again that it is very fortunate, from 
a computational point of view, that we can feel justified in selecting 
the canonical instead of the microcanonical distribution as giving the 
usually appropriate representation of equilibrium, since the calculation 
of mean values proves to be much easier in the canonical than in the
microcanonical case. 

C. SPECIFIC EXAMPLES OF EQUILIBRIUM

113. Equilibrium in Maxwell-Boltzmann systems 

We are now ready to make use of the canonical distribution, as giving
an appropriate description of equilibrium under ordinary circumstances,
in order to treat some specific examples. We shall begin by considering
a simple system of the Maxwell-Boltzmann type, which can be regarded
as composed of $n$ weakly interacting elements, having similar
properties, but permanently distinguishable from each other. As an
illustration we might take a system of harmonic oscillators all having
the same fundamental frequency but distinguishable from one another
by permanent spatial location or orientation.

We may regard the different energy states for a single element as
designated by the value of an index or quantum number $i$ and may
take the corresponding energy for the element as denoted by the symbol
$\epsilon_i$. Assuming weak interaction between the elements, we can then take
the possible energy states for the system as a whole as specified by a
group of such quantum numbers $i_1, i_2, \dots, i_n$ for the $n$ separate elements,
and may take the total energy for such a state of the whole system
as given by a sum of the form

$$\epsilon_{i_1} + \epsilon_{i_2} + \dots + \epsilon_{i_n}. \tag{113.1}$$





In accordance with our expression for the probabilities of different
states in a canonical ensemble, as given by (112.5), we can then write

$$P_{i_1 i_2 \dots i_n} = C e^{-\beta(\epsilon_{i_1} + \epsilon_{i_2} + \dots + \epsilon_{i_n})}, \tag{113.2}$$

where $C$ and $\beta$ are constant parameters, as an expression for the
equilibrium probability for any specified state of the whole system that
we may wish to consider.

Our actual interest is now going to lie in the probabilities for different
states of the separate elements. Fixing our attention for the moment
on the first of these elements corresponding to the index 1, and noting
that the summation of (113.2) over all possible states must lead to the
value unity, we may then write

$$P_{i_1} = \frac{C e^{-\beta\epsilon_{i_1}} \sum_{i_2, \dots, i_n} e^{-\beta(\epsilon_{i_2} + \dots + \epsilon_{i_n})}}{C \sum_{i_1} e^{-\beta\epsilon_{i_1}} \sum_{i_2, \dots, i_n} e^{-\beta(\epsilon_{i_2} + \dots + \epsilon_{i_n})}} \tag{113.3}$$

as an expression for the probability of finding element number 1 in the
state $i_1$. With the help of simple cancellations we then obtain

$$P_i = \frac{e^{-\beta\epsilon_i}}{\sum_i e^{-\beta\epsilon_i}} \tag{113.4}$$

as a general expression for the equilibrium probability of finding any
element, on which we wish to fix our attention, in the state $i$, the
summation in the denominator being over all possible states $i$. It will
be noted that such a probability for any particular element is
independent of the states of other elements of the system.
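The cancellation leading from (113.3) to (113.4) can be checked by brute force for a small system. The following sketch enumerates every state of a few distinguishable elements and compares the resulting probabilities for the first element with the single-element formula; the energy levels, the value of $\beta$, and the function names are assumptions chosen for illustration.

```python
import itertools
import math

def marginal_first_element(levels, n_elements, beta):
    """Probability of each state of element 1, found by summing the
    canonical weights over all states of the whole system."""
    weights = [0.0] * len(levels)
    total = 0.0
    for state in itertools.product(range(len(levels)), repeat=n_elements):
        w = math.exp(-beta * sum(levels[i] for i in state))
        weights[state[0]] += w   # accumulate by the state of element 1
        total += w
    return [w / total for w in weights]

def single_element_formula(levels, beta):
    """Right-hand side of (113.4)."""
    z = sum(math.exp(-beta * e) for e in levels)
    return [math.exp(-beta * e) / z for e in levels]

levels = [0.0, 1.0, 2.5]   # illustrative single-element energies
enumerated = marginal_first_element(levels, n_elements=4, beta=0.7)
formula = single_element_formula(levels, beta=0.7)
```

The two lists agree to rounding error, whatever number of other elements is included in the enumeration.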

With the help of this result we may now investigate the mean number
of elements $\bar{n}_i$ which would be found in a specified state $i$. In accordance
with the independence of the probabilities for the states of the different
elements, we can evidently write

$$(P_i)^{n_i} (1 - P_i)^{n - n_i} \tag{113.5}$$

for the probability of finding a specified set of $n_i$ elements in the state
$i$, with the remaining $n - n_i$ elements not in that state. Furthermore,
such a set of $n_i$ elements could evidently be chosen in

$$\frac{n!}{n_i!\,(n - n_i)!} \tag{113.6}$$

equivalent different ways. Combining (113.5) and (113.6), we can then
write

$$P(n_i) = \frac{n!}{n_i!\,(n - n_i)!} (P_i)^{n_i} (1 - P_i)^{n - n_i} \tag{113.7}$$

for the total probability $P(n_i)$ of finding precisely $n_i$ elements in state $i$.





And this will of course permit us to compute the mean number $\bar{n}_i$ in
that state.

The simplest way to perform the computation is to consider the
summation of (113.7) over all possible values of $n_i$ from 0 to $n$. This
gives us

$$\sum_{n_i=0}^{n} P(n_i) = \sum_{n_i=0}^{n} \frac{n!}{n_i!\,(n - n_i)!} (P_i)^{n_i} (1 - P_i)^{n - n_i} = 1, \tag{113.8}$$

where the value unity arises in the first instance from the circumstance
that the total probability for finding one or another number of elements
in state $i$ must evidently add up to 1. It will also be noted, however,
that the value of this summation would be equal to unity without
reference to the value of $P_i$, since the second form of expression is seen
to be the binomial expansion of $[P_i + (1 - P_i)]^n$, which is evidently unity
for any value of $P_i$. Hence, differentiating the second term of (113.8)
with respect to $P_i$, it will be readily seen that we can obtain the equation

$$\sum_{n_i=0}^{n} \left[ \frac{n_i}{P_i} P(n_i) - \frac{n - n_i}{1 - P_i} P(n_i) \right] = 0, \tag{113.9}$$

where we re-express the result of differentiating the two factors
containing $P_i$ in terms of $P(n_i)$ itself. Clearing of fractions and rearranging,
this then gives us

$$\sum_{n_i=0}^{n} P(n_i)\, n_i = n P_i \sum_{n_i=0}^{n} P(n_i). \tag{113.10}$$

Since $P(n_i)$ is the probability of finding $n_i$ elements in state $i$, the
left-hand side of the above expression is seen to be the mean number
of elements in state $i$, and the summation on the right-hand side is seen
to reduce to the factor unity. With the help of the expression (113.4)
already found for $P_i$ we can then write

$$\bar{n}_i = n P_i$$

or

$$\bar{n}_i = n \frac{e^{-\beta\epsilon_i}}{\sum_i e^{-\beta\epsilon_i}} \tag{113.11}$$

as the desired expression at equilibrium for the mean number of ele- 
ments in any particular state i that would be found in the members 
of a canonical ensemble representing the condition of equilibrium in a 
Maxwell-Boltzmann system composed of n similar but distinguishable 
elements. 

It is also easy to extend the use of the above apparatus to a calcula- 
tion of the deviations that can be expected from the mean number in 





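any state. Differentiating (113.8) a second time with respect to $P_i$ we
readily obtain after some simplification the result

$$\frac{\overline{n_i^2} - \bar{n}_i^2}{\bar{n}_i^2} = \frac{1}{\bar{n}_i} - \frac{1}{n}$$

or

$$\frac{\overline{(n_i - \bar{n}_i)^2}}{\bar{n}_i^2} = \frac{1}{\bar{n}_i} - \frac{1}{n} \tag{113.12}$$

as an expression for the fractional mean square deviation from the
mean number in any state $i$. It will be noted that this becomes small
for those states that are highly populated.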
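The results (113.11) and (113.12) can be confirmed numerically by summing directly over the distribution (113.7). The sketch below is a minimal check; the values chosen for $n$ and $P_i$ and the function names are illustrative assumptions.

```python
from math import comb

def occupation_distribution(n, p):
    """P(n_i) of (113.7) for n_i = 0, 1, ..., n."""
    return [comb(n, k) * p ** k * (1.0 - p) ** (n - k) for k in range(n + 1)]

def mean_and_fractional_deviation(n, p):
    """Mean number in the state and fractional mean square deviation."""
    dist = occupation_distribution(n, p)
    mean = sum(k * pk for k, pk in enumerate(dist))
    mean_square = sum(k * k * pk for k, pk in enumerate(dist))
    return mean, (mean_square - mean ** 2) / mean ** 2

n, p_i = 50, 0.2   # illustrative values
mean, fractional = mean_and_fractional_deviation(n, p_i)
```

With these values the mean number equals $nP_i$ and the fractional deviation equals $1/\bar{n}_i - 1/n$ to rounding error.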

If we consider a group of $g_k$ states all of substantially the same energy
$\epsilon_k$, an expression for the mean number of elements $\bar{n}_k$ having this energy
can be written, in agreement with (113.11), in the familiar form

$$\bar{n}_k = g_k e^{-\alpha - \beta\epsilon_k}, \tag{113.13}$$

where $\alpha$ is an appropriate constant. Comparing with our earlier
equation (89.5), we now see that our exact expression for the mean number
of elements in any condition k in a canonical ensemble is the same as 
our previous approximate expression for the most probable number of 
elements in that condition in the corresponding microcanonical en- 
semble. Hence our previous results as to the equilibrium properties of 
Maxwell-Boltzmann systems, as obtained in Chapter X, can now be 
taken over without substantial alteration. 


114. Equilibrium in Einstein-Bose and Fermi-Dirac gases
(a) Derivation of the distribution laws. We may now turn to the use 
of the canonical ensemble as representing equilibrium for a system 
composed of n weakly interacting particles, having similar properties 
and not permanently distinguishable from one another on account of 
their free motion inside a common container. Such a system will be an 
Einstein-Bose or a Fermi-Dirac gas, according as the eigenfunctions
which describe the states of the system are symmetric or antisymmetric 
to the interchange of particle indices as discussed in § 76. 

We may regard the different possible energy states for a single
particle of the system in the container under consideration as denoted
by the letters $k$, $l$, $m$, etc., and may take the corresponding eigenvalues
of energy for the particle as denoted by $\epsilon_k$, $\epsilon_l$, $\epsilon_m$, etc. Assuming weak
interaction between the particles, we can then take the possible states
for the system as a whole as specified by the numbers of particles
$n_k, n_l, n_m, \dots$ in the different elementary eigenstates, without any





specification as to which particles are selected for the different states.
Furthermore, we can take the total energy for any such state of the
system as given by a sum of the form

$$n_k \epsilon_k + n_l \epsilon_l + n_m \epsilon_m + \dots \tag{114.1}$$

for all $n$ particles. In accordance with our expression for the
probabilities of different states in a canonical ensemble, as given by (112.5),
we can then write

$$P_{n_k n_l n_m \dots} = C e^{-\beta(n_k \epsilon_k + n_l \epsilon_l + n_m \epsilon_m + \dots)}, \tag{114.2}$$

where $C$ and $\beta$ are constant parameters, as an expression for the
equilibrium probability for any specified state of the whole system that we
may wish to consider.

Our actual interest is now going to lie in the probabilities for finding
different numbers of particles in the different possible elementary
states. Fixing our attention for the moment on a particular one of
these states, say $k$, we can evidently immediately write

$$P(n_k) = \sum_{n_l, n_m, \dots} C e^{-\beta(n_k \epsilon_k + n_l \epsilon_l + n_m \epsilon_m + \dots)} \tag{114.3}$$

as a correct expression for the probability $P(n_k)$ of finding $n_k$ particles
in state $k$ provided we take the indicated summation over all possible
further assignments $n_l$, $n_m$, etc., subject to the necessary restriction

$$n_l + n_m + \dots = n - n_k. \tag{114.4}$$


The restriction expressed by this relation makes any precise evalua- 
tion of (114.3) very difficult to carry out and dependent in a specific 
manner on the energy levels for the particles under consideration. This 
makes it advantageous to introduce a simple approximate method of 
treatment which, as we shall see later, proves to be closely correct. 
For this purpose we now replace our original expression (114.2), for the 
probabilities of different states in an ensemble of members each
composed of a fixed number of particles $n$, by an expression for the
probabilities of different states in an ensemble of members containing all
different possible numbers of particles $n$. To this expression we assign
the form

$$P_{n_k n_l n_m \dots} = C e^{-\alpha n - \beta(n_k \epsilon_k + n_l \epsilon_l + n_m \epsilon_m + \dots)}, \tag{114.5}$$

where $C$, $\alpha$, and $\beta$ are constants, and where we now regard the total
number of particles

$$n = n_k + n_l + n_m + \dots \tag{114.6}$$

as a variable which can assume any value from zero to infinity.

We thus replace our original canonical ensemble of members each





composed of the same number of particles $n$ by a so-called grand
canonical ensemble, consisting of a weighted collection of canonical
ensembles, one for each possible value for the total number of particles $n$,
and all with the same value for the significant parameter $\beta$. Under the
circumstances of ordinary interest, however, the members of such an
ensemble will be highly concentrated in the neighbourhood of a
particular value for the total number of particles, as we shall show in the
next part of this section. With an appropriate choice of the constant $\alpha$
this concentration can be made to occur at any value for the total 
number of particles n that we desire. In the present instance we regard 
this introduction of the grand canonical ensemble merely as a con- 
venient computational device. In a later place, § 140, we shall show 
that the grand canonical ensemble can actually have a theoretical 
significance under appropriate circumstances. 

Making a fresh start, with the help of the expression for the proba- 
bilities of different states in our grand canonical ensemble as given 
by (114.5), substituting for n from (114.6), and noting that the summa- 
tion of probabilities over all states must be equal to unity, we can 
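now evidently write

$$P(n_k) = \frac{C e^{-(\alpha+\beta\epsilon_k) n_k} \sum_{n_l, n_m, \dots} e^{-(\alpha+\beta\epsilon_l) n_l - (\alpha+\beta\epsilon_m) n_m - \dots}}{C \sum_{n_k} e^{-(\alpha+\beta\epsilon_k) n_k} \sum_{n_l, n_m, \dots} e^{-(\alpha+\beta\epsilon_l) n_l - (\alpha+\beta\epsilon_m) n_m - \dots}} \tag{114.7}$$

as an expression for the probability of finding $n_k$ particles in state $k$.
And, since there is now no fixed total number of particles to put a
limitation on the indicated summations, we obtain by simple
cancellation the desired simple expression

$$P(n_k) = \frac{e^{-(\alpha+\beta\epsilon_k) n_k}}{\sum_{n_k} e^{-(\alpha+\beta\epsilon_k) n_k}} \tag{114.8}$$

for the probability of finding $n_k$ particles in a designated state $k$.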

We may evidently use this expression to determine the mean number
of particles $\bar{n}_k$ or the mean square number $\overline{n_k^2}$ that would be found in
state $k$. It now becomes necessary, however, to give separate
treatment to the Einstein-Bose case where there would be no limits on the
possible number of particles in a given state in agreement with the
character of symmetric eigenfunctions, and to the Fermi-Dirac case
where, as we know, the possible number of particles in a given state
could only be zero or one, in agreement with the Pauli exclusion
principle applying with antisymmetric eigenfunctions.





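To perform the desired computations in the Einstein-Bose case, it
proves convenient to consider the expression

$$\sum_{n_k=0}^{\infty} e^{-(\alpha+\beta\epsilon_k) n_k} = \frac{1}{1 - e^{-(\alpha+\beta\epsilon_k)}}, \tag{114.9}$$

where the summation is taken over all values of $n_k$ from zero to infinity,
in agreement with the considerations that the grand canonical ensemble
places no limitation on the total number of particles in a system, and
that the symmetrical character of the eigenfunctions places no
limitation on the number in any particular state; and where the left-hand
side is seen to be the correct binomial expansion of the right.
Differentiating (114.9) with respect to $(\alpha+\beta\epsilon_k)$, we obtain the result

$$\sum_{n_k=0}^{\infty} n_k e^{-(\alpha+\beta\epsilon_k) n_k} = \frac{e^{-(\alpha+\beta\epsilon_k)}}{[1 - e^{-(\alpha+\beta\epsilon_k)}]^2}, \tag{114.10}$$

and differentiating a second time, we obtain

$$\sum_{n_k=0}^{\infty} n_k^2 e^{-(\alpha+\beta\epsilon_k) n_k} = \frac{e^{-(\alpha+\beta\epsilon_k)}}{[1 - e^{-(\alpha+\beta\epsilon_k)}]^2} + \frac{2 e^{-2(\alpha+\beta\epsilon_k)}}{[1 - e^{-(\alpha+\beta\epsilon_k)}]^3}. \tag{114.11}$$

Furthermore, making use of the original expression (114.9), it will be
seen that we can rewrite these results in the forms

$$\frac{\sum_{n_k=0}^{\infty} n_k e^{-(\alpha+\beta\epsilon_k) n_k}}{\sum_{n_k=0}^{\infty} e^{-(\alpha+\beta\epsilon_k) n_k}} = \frac{1}{e^{\alpha+\beta\epsilon_k} - 1} \tag{114.12}$$

and

$$\frac{\sum_{n_k=0}^{\infty} n_k^2 e^{-(\alpha+\beta\epsilon_k) n_k}}{\sum_{n_k=0}^{\infty} e^{-(\alpha+\beta\epsilon_k) n_k}} = \frac{1}{e^{\alpha+\beta\epsilon_k} - 1} + 2\left(\frac{1}{e^{\alpha+\beta\epsilon_k} - 1}\right)^2. \tag{114.13}$$

Comparing with the expression for the probability $P(n_k)$ for finding
$n_k$ particles in the state $k$, as given by (114.8), we now see that (114.12)
gives

$$\bar{n}_k = \frac{1}{e^{\alpha+\beta\epsilon_k} - 1} \tag{114.14}$$

as an expression for the mean number of particles in state $k$. And
we see that (114.13) gives

$$\overline{n_k^2} = \bar{n}_k + 2\bar{n}_k^2 \tag{114.15}$$

as an expression for the mean square number in that state. This latter
can be rewritten in the form

$$\frac{\overline{n_k^2} - \bar{n}_k^2}{\bar{n}_k^2} = \frac{1}{\bar{n}_k} + 1, \tag{114.16}$$

which gives an expression for the fractional mean square deviation
from the mean number in any state, that may be compared with the
previous (classical) Maxwell-Boltzmann expression as given by (113.12).
We thus obtain the desired results in the case of Einstein-Bose
particles.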

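In the Fermi-Dirac case the possible numbers of particles in any
given state $k$ could only be zero or one, as we have already noted.
This makes it easy in this case to compute results analogous to those
obtained above. Starting with the expression, given by (114.8),

$$P(n_k) = \frac{e^{-(\alpha+\beta\epsilon_k) n_k}}{\sum_{n_k} e^{-(\alpha+\beta\epsilon_k) n_k}} \tag{114.17}$$

for the probability of finding $n_k$ particles in state $k$, the summation in
the denominator will now include only the two terms for $n_k = 0$ and
$n_k = 1$. Hence it will be immediately seen that we shall now obtain
the same result for the mean and for the mean square number of
particles in the state $k$

$$\bar{n}_k = \overline{n_k^2} = \frac{e^{-(\alpha+\beta\epsilon_k)}}{1 + e^{-(\alpha+\beta\epsilon_k)}}. \tag{114.18}$$

These findings may be rewritten in the more familiar forms

$$\bar{n}_k = \frac{1}{e^{\alpha+\beta\epsilon_k} + 1} \tag{114.19}$$

and

$$\frac{\overline{n_k^2} - \bar{n}_k^2}{\bar{n}_k^2} = \frac{1}{\bar{n}_k} - 1. \tag{114.20}$$

We thus also obtain the desired results in the case of Fermi-Dirac
particles.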
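Both sets of results can be checked by summing the single-state distribution (114.8) directly, with the occupation number unrestricted in the Einstein-Bose case and limited to zero or one in the Fermi-Dirac case. In the sketch below the quantity `x` stands for $\alpha+\beta\epsilon_k$; its value, the truncation of the infinite series, and the function name are assumptions made for the example.

```python
import math

def occupation_moments(x, max_occupation):
    """Mean and mean square of n_k under (114.8), with x = alpha + beta*eps_k
    and n_k running from 0 to max_occupation."""
    weights = [math.exp(-x * n) for n in range(max_occupation + 1)]
    z = sum(weights)
    mean = sum(n * w for n, w in enumerate(weights)) / z
    mean_square = sum(n * n * w for n, w in enumerate(weights)) / z
    return mean, mean_square

x = 0.5   # illustrative positive value of alpha + beta*eps_k

# Einstein-Bose: the infinite series is truncated where terms are negligible.
bose_mean, bose_square = occupation_moments(x, max_occupation=2000)
bose_fractional = (bose_square - bose_mean ** 2) / bose_mean ** 2

# Fermi-Dirac: only n_k = 0 and n_k = 1 are admitted.
fermi_mean, fermi_square = occupation_moments(x, max_occupation=1)
fermi_fractional = (fermi_square - fermi_mean ** 2) / fermi_mean ** 2
```

The fractional deviations come out as $1/\bar{n}_k + 1$ and $1/\bar{n}_k - 1$ respectively, lying on either side of the Maxwell-Boltzmann value.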

(b) Investigation of approximation. The foregoing expressions, both
for the Einstein-Bose and for the Fermi-Dirac case, were obtained with
the help of a grand canonical ensemble containing members with all
possible values for the total number of particles $n$. We must now
demonstrate the validity of this method of procedure† by showing for
the ordinary cases of interest that there would be a high concentration
of the members of the ensemble around the most probable number of
particles $\bar{n}$.

For this purpose we may start with our expression (114.5),

$$P_{n_k n_l n_m \dots} = C e^{-\alpha n - \beta(n_k \epsilon_k + n_l \epsilon_l + n_m \epsilon_m + \dots)}, \tag{114.21}$$


† Compare Pauli, Zeits. f. Phys. 41, 81 (1927).


for the probability of finding a member of the grand canonical ensemble 
in the indicated state. With the help of this formula we can then 
evidently write 

P(n) = Ce-°‘» 2 (114.22) 

as an expression for the probability of finding a member of the ensemble 
with any selected value for the total number of particles n, provided 
the summation is taken as indicated over all possible states for a system 
composed of that number of particles. It will be convenient to rewrite 
this expression in the abbreviated form 

P(n) = (114.23) 

where y is defined as a function of n by 

g-y(n) — 2 (114.24) 

Differentiating (114.23) with respect to n, we obtain 

= — (7e-I“»+)<»»~[a7i+y(«)] = 0 (114.26) 

on on 


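as the condition for a maximum value of the probability $P(n)$. Hence,
if we develop $\alpha n + y(n)$ around the most probable value $\bar{n}$ for the total
number of particles, we see that the second term of the expansion will
be missing, and that we shall obtain

$$\alpha n + y(n) = \alpha\bar{n} + y(\bar{n}) + \frac{1}{2}\left(\frac{\partial^2 y}{\partial n^2}\right)_{n=\bar{n}}(n-\bar{n})^2 \tag{114.26}$$

as an appropriate approximation. Substituting in (114.23), we then obtain

$$P(n) = C e^{-[\alpha\bar{n}+y(\bar{n})]}\, e^{-\frac{1}{2}\left(\frac{\partial^2 y}{\partial n^2}\right)_{n=\bar{n}}(n-\bar{n})^2} \tag{114.27}$$
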


as an approximate expression for the probability of finding a member 
of the ensemble with a total of n particles. As a suitable approximation 
for the mean square deviation we shall then have 


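$$\overline{(n-\bar{n})^2} = \frac{\int_0^\infty P(n)\,(n-\bar{n})^2\,dn}{\int_0^\infty P(n)\,dn} = \frac{\int_0^\infty e^{-\frac{1}{2}\left(\frac{\partial^2 y}{\partial n^2}\right)_{n=\bar{n}}(n-\bar{n})^2}(n-\bar{n})^2\,dn}{\int_0^\infty e^{-\frac{1}{2}\left(\frac{\partial^2 y}{\partial n^2}\right)_{n=\bar{n}}(n-\bar{n})^2}\,dn}. \tag{114.28}$$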


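Changing the variable of integration to $(n-\bar{n})$, and noting with $\bar{n}$ large
that it will then be a good approximation to take the limits of integration
as running from minus to plus infinity, we then have

$$\overline{(n-\bar{n})^2} = \frac{\int_{-\infty}^{+\infty} e^{-\frac{1}{2}\left(\frac{\partial^2 y}{\partial n^2}\right)_{n=\bar{n}}(n-\bar{n})^2}(n-\bar{n})^2\,d(n-\bar{n})}{\int_{-\infty}^{+\infty} e^{-\frac{1}{2}\left(\frac{\partial^2 y}{\partial n^2}\right)_{n=\bar{n}}(n-\bar{n})^2}\,d(n-\bar{n})}, \tag{114.29}$$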


which from known formulae of integration (Appendix II) gives us 

$$\overline{(n-\bar{n})^2} = \frac{1}{\left(\dfrac{\partial^2 y}{\partial n^2}\right)_{n=\bar{n}}} \tag{114.30}$$


as the desired result. 

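To investigate the magnitude which can be expected for this mean
square deviation, we may turn to the definition of $y(n)$ as given by
(114.24),

$$e^{-y} = \sum_{n_1+n_2+\dots=n} e^{-\beta(n_1\epsilon_1+n_2\epsilon_2+\dots)}, \tag{114.31}$$

and rewrite this for any value of $n$ in the form

$$e^{-y} = \sum_{\epsilon_i,\epsilon_j,\dots} \frac{n_\kappa!\,n_\lambda!\cdots}{n!}\; e^{-\beta(\epsilon_i+\epsilon_j+\dots)}, \tag{114.32}$$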


where we take a sum over all possible values $\epsilon_i, \epsilon_j, \dots$ for the energies
of $n$ particles. The symbols $n_\kappa, n_\lambda, \dots$ in this expression denote the
numbers of particles in each group to which the same elementary state
is assigned in any particular term of the summation. Hence the factor
introduced in front of the exponential term correctly allows for the
circumstance that a mere interchange in particles does not lead to a
different state of the system as a whole.

At a later time (see, for example, § 136 (a)) we shall have occasion to
obtain explicit evaluations for expressions of the form (114.32), to which
the name ‘sum-over-states’ may be applied. For our present purposes
it is merely necessary to call attention to the appearance of factorial $n$
in the denominator of each term of the summation. Using the Stirling
approximation for factorial numbers, and taking logarithms, this then gives

$$y = n\log n + \dots \tag{114.33}$$

as the ‘leading term’ of $y$. And by differentiation we then see that the
denominator in (114.30) will at least have the order of magnitude

$$\left(\frac{\partial^2 y}{\partial n^2}\right)_{n=\bar{n}} \approx \frac{1}{\bar{n}} \tag{114.34}$$

in the absence of fortuitous cancellations. Combining with (114.30), we
can then take

$$\frac{\overline{(n-\bar{n})^2}}{\bar{n}^2} \approx \frac{1}{\bar{n}} \tag{114.35}$$



as giving an idea of the order of magnitude of the fractional mean 
square deviation from the most probable number of particles for the 
members of our grand canonical ensemble. Since the fractional deviation
goes to zero as the most probable number of particles $\bar{n}$ becomes
large, this then justifies our approximate method of calculation in the
case of ordinary physical-chemical systems of interest where the actual 
number of particles is exceedingly high. 
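
The sharpness of the distribution in $n$ can be illustrated numerically. The sketch below, in Python, takes the simplest case in which the sum-over-states for $n$ particles reduces to $q^n/n!$, as would hold for a gas so dilute that no elementary state is multiply occupied; this reduction, the symbol $q$ for the one-particle sum, and the function name `number_fluctuation` are assumptions introduced only for the illustration. The probability of finding $n$ particles is then proportional to $e^{-\alpha n}q^n/n!$, and the fractional mean square deviation is computed by direct summation.

```python
import math


def number_fluctuation(alpha, q, n_max):
    """Mean particle number and fractional mean square deviation.

    Assumes P(n) proportional to exp(-alpha*n) * q**n / n!, a hypothetical
    dilute-gas form of the sum-over-states; q is an illustrative parameter.
    """
    # Work with logarithms so that large factorials do not overflow.
    log_w = [-alpha * n + n * math.log(q) - math.lgamma(n + 1)
             for n in range(n_max + 1)]
    shift = max(log_w)
    w = [math.exp(lw - shift) for lw in log_w]
    norm = sum(w)
    mean = sum(n * wn for n, wn in enumerate(w)) / norm
    var = sum((n - mean) ** 2 * wn for n, wn in enumerate(w)) / norm
    return mean, var / mean ** 2


for target in (10.0, 100.0, 1000.0):
    q = target * math.exp(1.0)  # chosen so that the mean is close to `target`
    mean, frac = number_fluctuation(1.0, q, n_max=int(10 * target))
    print(f"mean = {mean:10.3f}   fractional deviation = {frac:.6f}   1/mean = {1.0 / mean:.6f}")
```

Under these assumptions the fractional deviation falls off as the reciprocal of the mean number, in agreement with the order-of-magnitude estimate reached above.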

A somewhat different method of showing the tendency for the mem- 
bers of a grand canonical ensemble to be concentrated in the neigh- 
bourhood of a particular composition will be discussed in § 141 (e). 

(c) Further discussion of the Einstein-Bose and Fermi-Dirac distribution
laws. For purposes of comparing the results of this section with
our previous equilibrium results as obtained in Chapter X, it will now
be convenient to fix our attention on any selected group of $G_\kappa$ elementary
states all of which may be regarded as corresponding to substantially
the same value of energy $\epsilon_\kappa$. In accordance with (114.14) and
(114.19), we can then write

$$\bar{n}_\kappa = \frac{G_\kappa}{e^{\alpha+\beta\epsilon_\kappa} \mp 1} \tag{114.36}$$

as a convenient combined expression for the mean number of particles
that would be present at equilibrium with the energy $\epsilon_\kappa$, where the
upper sign refers to the Einstein-Bose and the lower to the Fermi-Dirac
case. Comparing with our previous equation (89.8), we now see that
our closely exact expression for the mean number of particles in any
condition $\kappa$ in a canonical ensemble is the same as our previous approximate
expression for the most probable number in that condition in the
corresponding microcanonical ensemble. Hence our previous results as
to the equilibrium properties of Einstein-Bose and Fermi-Dirac systems,
as obtained in Chapter X, can now be taken over without substantial
alteration.

Among these earlier results we obtained in § 91 (b) the important
possibility of relating the parameter $\beta$, appearing in the expressions
for the Maxwell-Boltzmann, Einstein-Bose, and Fermi-Dirac distributions,
to the temperature $T$ of the system by the relation

$$\beta = \frac{1}{kT}, \tag{114.37}$$


where $k$ is Boltzmann’s constant. It will be immediately evident that
this possibility still persists in our present mode of treatment. From the
general form of expression,

$$P_n = C e^{-\beta E_n}, \tag{114.38}$$

for the distribution of states in a canonical ensemble, it will be seen, in
the case of a system consisting of parts to which separate energies can
be ascribed, that each part can be regarded as represented at equilibrium
by a canonical distribution with the same value of $\beta$. This then
makes it possible, as before, to take a part of any system of interest
as consisting of a gas sufficiently dilute, so that it can be treated as
a classical perfect-gas thermometer, and so that the distribution of
particles described by (114.36) will assume its classical form. Using
this distribution, it then again becomes possible to obtain the previous
relation of the constant $\beta$ to the pressure $p$ and hence to the
phenomenologically defined temperature $T$ of the dilute gas.

To conclude this discussion of equilibrium in Einstein-Bose and
Fermi-Dirac gases, this may be taken as a convenient place to consider
a question, arising in the case of gases composed of real molecules,
which was not discussed in Chapter X. In the above derivation of the
Einstein-Bose and Fermi-Dirac distribution laws we have spoken of
the gases for simplicity as consisting of particles. It is evident, however,
that the derivation given would also apply to gases composed of
real molecules, provided the dilution is high enough so that the interaction
between molecules can be treated as weak, provided we use the
Einstein-Bose or Fermi-Dirac distribution according as the molecules
are composed of an even or odd total number of fundamental particles
(nuclei being treated as composed of protons and neutrons), and finally
provided we recognize that the different states of the molecules to 
which the distribution laws apply must be specified by giving their 
internal as well as their external state. This latter circumstance now 
makes it of interest to inquire into the relative numbers of molecules 
that would be in different internal states without reference to their 
external state of motion. 

To examine this question it now becomes convenient to regard the
different possible states of a molecule as specified by a pair of indices
$k$, $i$ referring to external and internal conditions respectively, and to take
the total energy $\epsilon_{ki}$ of the molecule in any such state as given by the
sum of its external kinetic energy $\epsilon_k$ and its internal energy $\epsilon_i$,

$$\epsilon_{ki} = \epsilon_k + \epsilon_i. \tag{114.39}$$

In the classical mechanics this additivity of energies would lead to
statistical independence of the distributions over states $k$ and states $i$,
and hence to the familiar Boltzmann ratio for the numbers of molecules
in different internal states $i$. We shall see, however, that for degenerate
Einstein-Bose and Fermi-Dirac gases the internal and external distributions
are not independent.

In accordance with (114.14) and (114.19), we may write

$$\bar{n}_{ki} = \frac{1}{e^{\alpha+\beta\epsilon_k+\beta\epsilon_i} \mp 1} \tag{114.40}$$

as an expression for the mean number of molecules which would be in
any such state $k$, $i$ at equilibrium, where the upper sign refers to the
Einstein-Bose and the lower to the Fermi-Dirac case. Furthermore, in
accordance with (71.16), we may write

$$dG = \frac{4\pi m v}{h^3}\,(2m\epsilon_k)^{1/2}\,d\epsilon_k \tag{114.41}$$

as an expression for the number of eigenvalues of kinetic energy which
would fall in any range $\epsilon_k$ to $\epsilon_k + d\epsilon_k$, where $v$ is the volume of the
container and $m$ the mass of a molecule. Combining with (114.40), we
can now write

$$d\bar{n}_i = \frac{4\pi m v}{h^3}\,(2m)^{1/2}\,\frac{\epsilon_k^{1/2}\,d\epsilon_k}{e^{[\alpha+\beta\epsilon_i+\beta\epsilon_k]} \mp 1} \tag{114.42}$$

as an expression for the mean number of molecules in the internal state
$i$ and in the external energy range $d\epsilon_k$.

Considering the integration of this expression over all possible values
of the kinetic energy $\epsilon_k$ from zero to infinity, making use of our previous
expressions for the result of such integrations as given by (93.12) and
(94.7), and replacing $\beta$ by $1/kT$, we now obtain


$$\bar{n}_i = \frac{v\,(2\pi mkT)^{3/2}}{h^3}\; U\!\left(\alpha+\frac{\epsilon_i}{kT}\right)$$

and

$$\bar{n}_i = \frac{v\,(2\pi mkT)^{3/2}}{h^3}\; V\!\left(\alpha+\frac{\epsilon_i}{kT}\right) \tag{114.43}$$

as respective expressions in the Einstein-Bose and Fermi-Dirac cases
for the mean number of particles in any internal state $i$. The functions
$U$ and $V$ in these expressions are the definite integrals originally introduced
in §§ 93 (b) and 94 (a) for the treatment of the two cases. It will
be noted, however, that they now depend on the argument $[\alpha+\epsilon_i/kT]$
instead of simply on $\alpha$ as in our earlier considerations. We may make
use of these expressions to obtain the desired information as to the
relative numbers of molecules that would be present in different internal
states $i$.

Considering cases where the gas degeneration is sufficiently small to
permit the use of the series expansions for $U$ and $V$, which are given
by (93.9) and (94.3), and fixing our attention on any pair of internal
states $i$ and $j$, we readily obtain

$$\frac{\bar{n}_i}{\bar{n}_j} = e^{-(\epsilon_i-\epsilon_j)/kT}\left[1 \pm \frac{e^{-\alpha}}{2^{3/2}}\left(e^{-\epsilon_i/kT}-e^{-\epsilon_j/kT}\right)+\dots\right], \tag{114.44}$$

where the upper and lower signs refer respectively to the Einstein-Bose
and Fermi-Dirac cases, as an expression for the ratio of the equilibrium
numbers of molecules that can be expected in two different internal
states $i$ and $j$ at the temperature $T$. It is of interest to note that this
approaches the familiar Boltzmann ratio,

$$\frac{\bar{n}_i}{\bar{n}_j} = e^{-(\epsilon_i-\epsilon_j)/kT}, \tag{114.45}$$

as $\alpha$ gets larger and the degree of gas degeneration gets smaller. For
all ordinary gases under laboratory conditions, $\alpha$ is large enough so that
deviations from the Boltzmann ratio are quite negligible. This conclusion
is of great importance in the study of physical-chemical equilibria.†
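
The approach to the Boltzmann ratio with increasing $\alpha$ may be seen numerically from the following Python sketch. It assumes that the series for the two functions take the standard forms $\sum_l (\pm 1)^{l-1} e^{-l\alpha}/l^{3/2}$, with the upper sign for the Einstein-Bose and the lower for the Fermi-Dirac case; this form, the function names `series` and `population_ratio`, and the reduced energies $x = \epsilon/kT$ are assumptions of the illustration, the equations (93.9) and (94.3) themselves not being reproduced in this section.

```python
import math


def series(arg, sign, terms=200):
    """Assumed series sum_{l>=1} sign**(l-1) * exp(-l*arg) / l**1.5.

    sign = +1 is taken for the Einstein-Bose case, sign = -1 for Fermi-Dirac.
    """
    return sum(sign ** (l - 1) * math.exp(-l * arg) / l ** 1.5
               for l in range(1, terms + 1))


def population_ratio(alpha, x_i, x_j, sign):
    """Ratio of mean numbers in internal states i and j; x = eps / kT."""
    return series(alpha + x_i, sign) / series(alpha + x_j, sign)


x_i, x_j = 1.0, 0.0
boltzmann = math.exp(-(x_i - x_j))
for alpha in (1.0, 3.0, 10.0):
    eb = population_ratio(alpha, x_i, x_j, +1)
    fd = population_ratio(alpha, x_i, x_j, -1)
    print(f"alpha = {alpha:5.1f}  EB/Boltzmann = {eb / boltzmann:.6f}  FD/Boltzmann = {fd / boltzmann:.6f}")
```

With these assumed series the Einstein-Bose ratio lies below and the Fermi-Dirac ratio above the Boltzmann value for the upper state, and both deviations shrink rapidly as $\alpha$ grows.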

A case, however, where the general formula (114.43) leads to results 
radically different from the Boltzmann ratio is provided by Pauli’s 
treatment of the paramagnetic susceptibility of metals which has 
already been mentioned in § 94(e). Treating the conduction electrons 
of a metal as a Fermi-Dirac gas, the second form of (114.43) may be 
used to compute the ratio of the numbers of electrons with magnetic 
moment oriented parallel and antiparallel to an external field. Because
of the high degeneracy of the electron gas at ordinary temperatures, 
this ratio is much nearer unity than the Boltzmann ratio and gives 
correspondingly smaller paramagnetic susceptibility. 


† Deviations from the Boltzmann ratio have been discussed for the Einstein-Bose
case by Lewis and Mayer, Proc. Nat. Acad. 15, 208 (1929), and for the Fermi-Dirac case
by Brillouin, Les Statistiques Quantiques, Paris, 1930, p. 160.


115. Equilibrium in general in physical-chemical systems

In accordance with the relation (114.37) between the parameter $\beta$
and the temperature $T$,

$$\beta = \frac{1}{kT}, \tag{115.1}$$

which was seen to be justified in the preceding section, we may now
express the canonical distribution (112.6) for the equilibrium probabilities
$P_n$ for the various energy states of any system in the form

$$P_n = C e^{-E_n/kT}, \tag{115.2}$$

where $C$ is an appropriate normalizing factor. It will be appreciated
that this now provides a fundamental apparatus for studying the
equilibrium properties, at a specified temperature $T$, for any kind of
system for which we can determine the energy states $n$ corresponding
to different possible eigenvalues $E_n$.

It is of interest to compare the above distribution with the microcanonical
distribution (110.3), for the probabilities of different energy
states, which can be written in the form

$$P_n = \begin{cases} P_0 & (E_n \text{ in range } E \text{ to } E+\delta E) \\ 0 & (E_n \text{ not in range } E \text{ to } E+\delta E), \end{cases} \tag{115.3}$$

where $P_0$ is a constant. This provides an apparatus, as was discussed
in § 110, for studying the equilibrium properties of a system of specified
energy.

In investigating the ordinary problems of physical-chemical equilibria 
which arise, it is usually preferable to employ the apparatus provided 
by the canonical distribution (115.2) rather than that provided by the
microcanonical distribution (115.3). This is due in the first place to 
the consideration that we are usually interested in systems which have 
sufficient contact with their surroundings so that they can be better 
regarded as having a specified temperature than as having a specified 
energy; and is due in the second place to the circumstance that it is 
much easier to give accurate mathematical treatment to the canonical 
than to the microcanonical distribution. Nevertheless, as previously 
remarked and as already illustrated by examples, the two methods 
lead to substantially identical results in the case of systems of many
degrees of freedom, since the canonical distribution then becomes highly
concentrated around a particular value of the energy, in agreement with 
our physical experience that the fluctuations in energy are unimportant 
for an ordinary macroscopic system immersed in a temperature bath. 
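
The concentration of the canonical distribution around a particular energy can be made concrete with a deliberately simple model. The Python sketch below assumes a system built of independent units, each having only two energy levels, 0 and $\epsilon$, populated according to the canonical distribution; the model, the function name `relative_energy_fluctuation`, and the parameter values are assumptions chosen for illustration and are not drawn from the text.

```python
import math


def relative_energy_fluctuation(n_units, beta, eps):
    """Root mean square energy fluctuation divided by the mean energy.

    Assumes n_units independent two-level units (levels 0 and eps), each
    in canonical equilibrium with parameter beta = 1/kT.
    """
    p = math.exp(-beta * eps) / (1.0 + math.exp(-beta * eps))  # upper-level probability
    mean_energy = n_units * eps * p
    variance = n_units * eps ** 2 * p * (1.0 - p)  # variances of independent units add
    return math.sqrt(variance) / mean_energy


for n_units in (10 ** 2, 10 ** 6, 10 ** 12):
    print(n_units, relative_energy_fluctuation(n_units, beta=1.0, eps=1.0))
```

In this model the relative fluctuation falls off as the inverse square root of the number of units, which is the sense in which the two kinds of ensemble become practically equivalent for systems of many degrees of freedom.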

With the help of the canonical distribution described by (115.2), it
would now be possible to proceed at once to further studies of equilibrium
phenomena, including equilibrium between regions of different potential
energy, equilibrium between condensed and gaseous phases, and equi- 
librium in chemical reactions. We shall regard it as preferable, how- 
ever, to postpone such studies until after we have presented a statistical 
mechanical explanation of the principles of thermodynamics. This will
make it possible to introduce the idea of temperature into our con-
siderations from a more fundamental thermodynamic point of view, 
and to express the results obtained in the kind of language commonly 
employed by physical chemists. 

116. The principle of detailed balance in the quantum mechanics 

We now conclude the present chapter by using the studies, which
we have made in connexion with the H-theorem, to throw light on the
mechanism responsible for the maintenance of steady properties in
the case of a quantum mechanical system in a state of macroscopic
equilibrium. It will be found that the maintenance of such properties
can be regarded as guaranteed by a principle which is the quantum
analogue of the classical principle of detailed balance, which was discussed
in § 60 (c) of Chapter VI. To carry out the investigation we shall
consider the frequency of transition between different conditions of the 
system of interest, making use of the method of transition probabilities 
as giving a good insight into the mechanism involved, and as being 
sufficiently exact for that purpose. 

Let us fix our attention on two possible conditions of the system of
interest (or, if necessary, of the system proper and its surroundings)
which we denote by the letters $\kappa$ and $\nu$. Let us regard these conditions
as defined by two groups of neighbouring nearly steady states, $G_\kappa$ and
$G_\nu$ in number, which lie in the same energy range $E$ to $E+\Delta E$, but
which correspond to observably different situations. Let us then regard
ourselves as making observations on such conditions, when the system
is in equilibrium, in order to determine the probabilities for transition
between them.

In accordance with the circumstances, that equilibrium for the system
corresponds in any case to an ensemble giving the same probabilities
for different states of the same energy, and that our observations are
made on nearly true energy states, we can take the probabilities $P_\kappa$ and
$P_\nu$ for finding the two conditions as proportional to the corresponding
numbers of states $G_\kappa$ and $G_\nu$. This then permits us to write

$$\frac{P_\kappa}{P_\nu} = \frac{G_\kappa}{G_\nu} \tag{116.1}$$


for the ratio of the two probabilities. Furthermore, in accordance with
the circumstances, that representative ensembles for systems in equilibrium
are diagonal in the energy language, and that our observations
would give no information as to the phases of the states in $G_\kappa$ or $G_\nu$,
we have the necessary conditions for applying the method of transition
probabilities, and can take our previous expressions (99.42) and (99.43),

$$Z_{\kappa\nu} = A_{\kappa\nu}\,G_\nu\,P_\kappa \tag{116.2}$$

and

$$Z_{\nu\kappa} = A_{\nu\kappa}\,G_\kappa\,P_\nu \tag{116.3}$$

with

$$A_{\kappa\nu} = A_{\nu\kappa} \tag{116.4}$$

as giving the numbers of transitions per unit time, $Z_{\kappa\nu}$ and $Z_{\nu\kappa}$, which
we can expect from condition $\kappa$ to $\nu$ and from $\nu$ to $\kappa$ respectively. Combining
the foregoing equations, we can then obtain

$$Z_{\kappa\nu} = Z_{\nu\kappa}. \tag{116.5}$$

This result shows that we may expect, on the average, for a system at
equilibrium the same frequency of transition from the condition $\kappa$ to $\nu$
as from the condition $\nu$ to $\kappa$. This then means that we can regard the
steady state of affairs at equilibrium as maintained by a direct balance
between the rates of opposing processes; that is, the transitions from
$\kappa$ to $\nu$ do not have to be thought of as balanced with the help of some
indirect route such as $\nu$ to $\lambda$ to $\kappa$. This is the quantum mechanical form
of the principle of detailed balance.
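
The balance of opposing rates can be checked in a few lines. The Python sketch below assumes that the number of transitions per unit time from one condition to another is the product of a symmetric coefficient, the number of states in the final condition, and the probability of the initial condition, and that the equilibrium probabilities are proportional to the numbers of states; this form of the rate, the function name `transition_rates`, and all numerical values are assumptions adopted for the illustration.

```python
def transition_rates(A, G, P):
    """Assumed rates Z[k][v] = A[k][v] * G[v] * P[k] between conditions k and v."""
    n = len(G)
    return [[A[k][v] * G[v] * P[k] for v in range(n)] for k in range(n)]


# Hypothetical numbers of states in three conditions of the same energy range.
G = [3, 5, 2]
# Hypothetical symmetric coefficients, A[k][v] == A[v][k].
A = [[0.0, 0.2, 0.7],
     [0.2, 0.0, 0.1],
     [0.7, 0.1, 0.0]]
# Equilibrium probabilities taken proportional to the numbers of states.
total = sum(G)
P = [g / total for g in G]

Z = transition_rates(A, G, P)
for k in range(len(G)):
    for v in range(k + 1, len(G)):
        print(f"Z[{k}][{v}] = {Z[k][v]:.4f}   Z[{v}][{k}] = {Z[v][k]:.4f}")
```

Each pair of conditions is balanced separately, without any appeal to a cyclic route through a third condition.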

The derivation of this principle in the quantum mechanics, as given 
above, is seen to be rather more general and more direct than the 
corresponding classical derivation as given in Chapter VI. This arises 
partly from the general character of the formulation in terms of transi- 
tion probabilities (cf. § 99), which makes it unnecessary to consider the 
special nature of the processes taking place in the system of interest. 
In addition, the early introduction of statistical assumptions, especially 
the assumption of random phases as a necessary prerequisite for the 
formulation in terms of transition probabilities, may be regarded as 
including that grouping of states with their reverse states, which was 
taken as a necessary special step in the classical treatment of this 
question. 

The principle of detailed balance is often very useful in obtaining an
insight into the behaviour of complicated systems. A specific application,
which is of use in the theory of chemical kinetics, will make its
importance clearer. Consider a gas containing molecules which can
exist in various states of kinetic energy $\epsilon_k$ and internal energy $\epsilon_i$, and
let the condition of the gas as a whole be specified by the numbers of
molecules of the different kinds present in the different individual
states which are possible. Consider, then, two conditions of the gas
which differ in that the first condition has a pair of molecules with
high kinetic energy and low internal excitation, and the second has
a corresponding pair of molecules of average kinetic energy, one of the
molecules, however, now being in a highly excited internal state so that
we can regard it as chemically activated for some reaction of interest.
Assuming a collisional mechanism for the transitions between these two
conditions, the principle of detailed balance will then allow us to equate
the numbers of collisions in unit time which would lead at equilibrium
to the inverse transitions between the above two conditions. This is
of importance in chemical kinetics since it permits a determination of
the rate of chemical activation from the more easily calculated frequency
of deactivating collisions which would prevail at equilibrium.



XIII


STATISTICAL EXPLANATION OF THE PRINCIPLES
OF THERMODYNAMICS


117. Introduction 

(a) Thermodynamic system and representative ensemble. In the present
chapter we are now ready to undertake the important task of
using the methods of statistical mechanics to provide an explanation
for the principles of thermodynamics. The methods of statistical
mechanics are specially devised for treating the behaviour, which can
be expected on the average, in the case of mechanical systems in conditions
corresponding to an incomplete specification of precise state.
And the principles of thermodynamics are devised for giving a pheno-
menological account of the gross behaviour of macroscopic physical 
systems in conditions corresponding to the specification of a limited 
number of thermodynamic variables such as volume, pressure, energy, 
temperature, or entropy. The desired explanation of thermodynamics 
depends on showing that the science of statistical mechanics provides 
an appropriate interpretation for such non-mechanical variables as 
temperature and entropy, and provides predictions as to the average 
behaviour of systems of many degrees of freedom in substantial agreement
with the predictions of thermodynamics.

In accordance with the methods developed in preceding chapters, 
the properties of a thermodynamic system, whose condition is described 
by the values of a limited number of thermodynamic variables, may 
be studied with the help of the average properties of an appropriately 
chosen representative ensemble of systems, of similar constitution to the 
one of actual interest. In a general way it may be said that the appro- 
priate choice of representative ensemble depends on taking a distribu- 
tion of the members of the ensemble over their different possible 
individual states, which agrees, on the one hand, with our knowledge 
of the thermodynamic variables that have been measured, and which 
conforms, on the other hand, with the hypotheses of equal a priori 
probabilities and of random a priori phases on which we have based 
our deductions of statistical results. 

(b) The nature of thermodynamic variables. We must now consider
the nature of the thermodynamic variables, the values of which determine
the selection of an appropriate representative ensemble. These
variables may be conveniently grouped into three classes.

As the first class of such quantities we shall take external parameters,
having values which we regard as definitely established by external 
agencies without reference to the internal condition of the system of 
interest. They include typically such quantities as the volume of the 
system determined, for example, by the position of a piston, the in- 
tensity of fields of force acting on the system due to the presence of 
neighbouring bodies, and the coordinates locating the position of heat 
reservoirs which we may wish to bring into thermal contact with the 
system of interest. These external parameters depend either directly 
or indirectly on the quantities designated as external coordinates by 
Gibbs† in the classical development of statistical mechanics. As in the
classical statistics, we can take the values of these parameters as pre- 
cisely determined. This is made possible, in the first place, as in the 
classical development, by the consideration that these quantities are 
not dependent on the imperfectly known precise internal state of the 
thermodynamic system of interest; and is made possible, in the second 
place, for the purposes of a quantum treatment, by the consideration 
that the external parameters will be chosen so as to depend on the 
gross properties of macroscopic external bodies, which are unaffected
by the limitations imposed by the Heisenberg uncertainty principle.
Owing to this precise determination it will be possible to assign the
same values, as are exhibited by the system of interest, to the external 
parameters of all the systems in the representative ensemble. 

As the second class of thermodynamic variables we shall take 
mechanical quantities, having values dependent on the condition of the 
system, which we regard as determined by gross measurements that are 
not sufficient to give a knowledge of the precise state of our system.
As typical examples of such variables we have the pressure exerted by 
the system and its total energy content. In general, the values of these 
mechanical quantities cannot be regarded as precisely determined for 
the particular thermodynamic system of interest. In this connexion, 
however, we find a certain difference between the classical point of 
view and the quantum mechanical point of view which we must now 
adopt. In the classical statistics a quantity such as pressure was taken 
as necessarily somewhat indeterminate since the instantaneous force 
exerted on the walls of a container by the molecules composing a system 

† Gibbs, Elementary Principles in Statistical Mechanics, Yale University Press, 1902.
See, for example, chapter xiii, p. 162. It will be noticed that the treatment given in
the present chapter is in many places merely the quantum mechanical transcription
of the classical treatment of the relation between thermodynamics and statistical
mechanics given by Gibbs.


would be subject to fluctuations, depending on the unknown values of 
internal coordinates and momenta. But a quantity such as the total 
energy of a system was regarded as subject in principle to precise deter- 
mination without important effect on the subsequent behaviour of that 
system. In the quantum statistics, however, we shall generally have 
to regard both the pressure and the energy of a system as not precisely 
determined. In the case of pressure, this will be true not only for the 
classical reason that the necessary knowledge as to internal coordinates 
and momenta is lacking, but also for the added quantum mechanical 
reason that the necessary knowledge as to the simultaneous values of 
such complementary variables could not even be theoretically obtained. 
And in the case of energy we shall have to take this quantity also as 
only approximately determined in general, since its precise measurement
would involve an infinite length of time and would put the system
in a steady quantum mechanical state, thus drastically affecting its
subsequent behaviour. In agreement with their approximate deter- 
mination, we shall represent the values of the mechanical quantities 
characterizing a thermodynamic system by the average values of these 
quantities in the representative ensemble. 

As the third class of thermodynamic variables we shall take the
essentially non-mechanical thermodynamic quantities, which the science
of thermodynamics has specially introduced on its own level of scientific
abstraction for the treatments which it gives. These quantities are
temperature and entropy, and the various subsidiary variables which 
can be defined with their help. We shall find later (§ 122) that the 
statistical mechanical analogues of temperature and entropy are pro- 
vided by quantities which characterize the distribution of the repre- 
sentative ensemble appropriate for the thermodynamic system of 
interest. Further discussion of these quantities will have to be post- 
poned until that time. 

(c) Energy, work, and heat. For the purposes of thermodynamics it
has been found essential to distinguish two different kinds of process
by which the internal energy of a system may be changed. These processes
are the performance of work by the system on its surroundings
and the flow of heat from the surroundings into the system, positive
and negative values being possible for either of those quantities.

The concepts of the internal energy of a thermodynamic system, and
of the transfer of energy between system and surroundings, presuppose
a sufficiently distinct, actual or ideal, boundary between system and
surroundings so that the energy belonging to the system can be distinguished
(at least with suitable approximation) from that of the surroundings.
Under these circumstances the internal energy of the system
of interest can then be correlated, as noted above, with the average
energy of the systems in the appropriate representative ensemble.

The concept of the work performed by a thermodynamic system on 
its surroundings depends on the possibility of changing the values of 
external parameters for the system, which have the nature of generalized
coordinates. In the thermodynamic treatment, when a small
variation is made in the value of such an external coordinate, the work
done can be set equal to the product of that variation by the corresponding
generalized force exerted by the system on its surroundings,
and can be regarded as equal to the (potential) energy thereby transferred
to external bodies. As a simple typical example, when the
volume of a system is varied by the amount $\delta v$, we set the work done
and energy thereby transferred equal to $p\,\delta v$, where the generalized
force $p$ is the pressure exerted by the system. In the corresponding
statistical treatment we can assign to all the systems in the representative
ensemble the same values for the external coordinates as those
of the thermodynamic system of interest, and can represent the 
corresponding generalized forces by their average values in that 
ensemble. The work done and the energy thereby transferred to 
external bodies can then be represented by its average value for the 
ensemble, as given by the product of the definite value of the displace- 
ment with the average value of the corresponding generalized force. 

The concept of the heat transferred between a thermodynamic system
and its surroundings depends on the notion of thermal contact of the
system with a so-called heat reservoir, the transfer of energy then
taking place in a manner which cannot be followed in detail by the
macroscopic measurements of thermodynamics since it is dependent
on the unknown internal states of the two bodies involved. In the
statistical mechanical treatment of such processes it is evident that we
shall in general have to introduce two representative ensembles, one
to correspond to the thermodynamic system of interest and the other to
the heat reservoir. By considering each system in the ensemble for the
thermodynamic system in thermal contact separately with each system
in the ensemble for the heat reservoir, we can now create a third
ensemble (of higher order in the number of systems) to correspond to
the process of heat transfer. We may then represent the heat flow into
our thermodynamic system by the average energy decrease in the heat 
reservoirs in this third ensemble. 





118. The energy principle for ensembles 
We may now prepare the way for a presentation of the statistical 
interpretation of the familiar first law of thermodynamics by con- 
sidering the quantum mechanical analogue of the energy principle as 
applied to an ensemble of systems. In its quantum form this principle 
requires, in the case of an ensemble of isolated systems, that the probability
for finding a member of the ensemble with any specified value
of energy should remain constant independent of time. As a particular 
consequence of this principle it is then evident that the mean energy 
for the members of such an ensemble would be constant in time. This 
restricted form of the principle will be of special interest to us in the
explanation of thermodynamic phenomena. 

It is evident that the truth of this principle is immediately guaranteed 
by the laws of the quantum mechanics itself, since these have shown 
that the mean or expectation value of the energy for each individual 
system in the ensemble would itself be independent of the time. Never- 
theless it may be instructive to give a derivation of the principle with 
the help of the formalism that has been specially devised for the treat- 
ment of statistical mechanical problems. 

In accordance with (78.9), we can write

$$\bar{E} = \sum_{n,m} H_{mn}\,\rho_{nm} \tag{118.1}$$

for the mean energy of the systems in an ensemble, where the $H_{mn}$ are
elements of the matrix corresponding to the Hamiltonian operator $H$,
and the $\rho_{nm}$ are the elements of the density matrix which describe the
distribution of the ensemble. Furthermore, for the rate of change in
the density matrix with time we can write, in accordance with (81.20),

$$\frac{d\rho_{nm}}{dt} = -\frac{2\pi i}{h}\sum_{k}\left(H_{nk}\,\rho_{km} - \rho_{nk}\,H_{km}\right). \tag{118.2}$$

Combining (118.1) and (118.2), we then obtain

$$\frac{d\bar{E}}{dt} = -\frac{2\pi i}{h}\sum_{m,n,k}\left(H_{mn}H_{nk}\,\rho_{km} - H_{km}H_{mn}\,\rho_{nk}\right) = -\frac{2\pi i}{h}\sum_{m,n,k}\left(H_{mn}H_{nk}\,\rho_{km} - H_{mn}H_{nk}\,\rho_{km}\right) = 0, \tag{118.3}$$

where the second form of writing results from the possibility of changing
our choice of letters for dummy indices over which we take a summation.



We thus see indeed that the mean energy in an ensemble of isolated 
systems would be independent of the time. 
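
The vanishing of the sum in (118.3) can also be checked numerically. The following is a minimal sketch, not part of the original treatment: the matrix size, the random seed, the unit choice h = 1, and all function names are assumptions made for illustration. It builds a random Hermitian Hamiltonian matrix and a random density matrix, forms the rate of change of the density matrix from (118.2), and evaluates the rate of change of the mean energy.

```python
import math
import random


def dagger(a):
    """Conjugate transpose of a square matrix stored as nested lists."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(a):
    """Sum of the diagonal elements."""
    return sum(a[i][i] for i in range(len(a)))


def random_matrix(n, rng):
    """Matrix of random complex numbers (illustrative input only)."""
    return [[complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
            for _ in range(n)]


def random_hamiltonian(n, rng):
    """Hermitian matrix H = (B + B^dagger) / 2 built from a random B."""
    b = random_matrix(n, rng)
    bd = dagger(b)
    return [[(b[i][j] + bd[i][j]) / 2 for j in range(n)] for i in range(n)]


def random_density_matrix(n, rng):
    """Hermitian, non-negative, unit-trace matrix B B^dagger / Tr(B B^dagger)."""
    b = random_matrix(n, rng)
    bbd = matmul(b, dagger(b))
    t = trace(bbd).real
    return [[x / t for x in row] for row in bbd]


def mean_energy_rate(h_mat, rho, planck_h=1.0):
    """Sum over n, m of H_mn * d(rho_nm)/dt, with d(rho)/dt taken from (118.2)."""
    n = len(h_mat)
    h_rho = matmul(h_mat, rho)
    rho_h = matmul(rho, h_mat)
    factor = -2j * math.pi / planck_h
    drho = [[factor * (h_rho[i][j] - rho_h[i][j]) for j in range(n)]
            for i in range(n)]
    return trace(matmul(h_mat, drho))


rng = random.Random(0)
H = random_hamiltonian(4, rng)
rho = random_density_matrix(4, rng)
rate = mean_energy_rate(H, rho)
print(abs(rate))  # expected to vanish apart from rounding error
```

The check does not depend on the particular matrices chosen, since the cancellation in (118.3) is purely a matter of relabelling the summation indices.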

119. The analogue of the first law of thermodynamics 

We are now ready to consider the first law of thermodynamics and 
its statistical mechanical analogue. In the classical thermodynamics 
this law was customarily written in the form 

$$\Delta E = Q - W, \tag{119.1}$$

where $\Delta E$ is the increase in the energy of a system that accompanies 
the absorption of heat Q from the surroundings and the performance 
of work W on the surroundings. It may be regarded as an expression 
of the classical principle of the conservation of energy written for the 
purposes of thermodynamics in a form which distinguishes the two 
different modes of energy transfer, between surroundings and system, 
denoted by the terms heat and work. 

In order to obtain the analogue of the above equation in a quantum 
statistical mechanics, it is evident that the quantities appearing therein 
will have to be represented by averages provided by ensembles of 
systems appropriate for representing the processes of transfer of heat 
and performance of work. The considerations of the two preceding 
sections have provided the necessary concepts for obtaining the desired 
modified form of equation. 

The thermodynamic system itself, together with the external bodies 
on which it can do work, may be represented by an ensemble of systems 
having the mean internal energy $\bar{\bar{E}}_1$ in the parts corresponding to the 
thermodynamic system proper and the mean energy $\bar{\bar{F}}_1$ in the bodies 
external thereto. Furthermore, the heat reservoir involved in the process 
may be represented by an ensemble of systems having the mean 
energy $\bar{\bar{E}}_2$. By considering a representative for each member in the 
first ensemble as placed in appropriate thermal contact separately with 
each member of the second, we can then construct a new ensemble 
suitable for studying both the performance of work and the transfer of 
heat. The mean energy of this final ensemble can evidently be written as 

$$\bar{\bar{E}}_{12} = \bar{\bar{E}}_1 + \bar{\bar{F}}_1 + \bar{\bar{E}}_2. \tag{119.2}$$

Each member of this final ensemble, however, is itself an isolated 
system composed of one representative for the thermodynamic system 
proper, one for the bodies external thereto upon which work can be 
done, and one for the heat reservoir. Hence, in accordance with the 
preceding section on the energy principle for ensembles, the mean 
energy cannot change with the time, and we shall have 

$$\Delta\bar{\bar{E}}_{12} = \Delta\bar{\bar{E}}_1 + \Delta\bar{\bar{F}}_1 + \Delta\bar{\bar{E}}_2 = 0 \tag{119.3}$$

for any processes that take place. 

This result can now be rewritten in the desired form, analogous to 
the usual first law equation, 

$$\Delta\bar{\bar{E}} = \bar{\bar{Q}} - \bar{\bar{W}}, \tag{119.4}$$

where $\Delta\bar{\bar{E}} = \Delta\bar{\bar{E}}_1$ is the mean increase in the internal energy of the 
representatives of the thermodynamic system proper, $\bar{\bar{Q}} = -\Delta\bar{\bar{E}}_2$ is 
the mean energy transferred from the heat reservoirs, and $\bar{\bar{W}} = \Delta\bar{\bar{F}}_1$ 
is the mean work done on external bodies. 

In making this interpretation of the first law of thermodynamics, 
it will be noted that we have actually selected the mean values of 
quantities in an appropriate representative ensemble as the kind of 
averages to be correlated with the precise values which would be assigned 
to such quantities in thermodynamic considerations. It would, of 
course, also be possible to select some other kind of average instead 
of the mean for this purpose. Nevertheless, in the typical situations 
to which thermodynamics may be applied, the different common kinds 
of average which might be selected are actually found to have sub- 
stantially identical values on account of the negligible character of 
fluctuations around the mean. It is hence possible and proves con- 
venient, as we shall see also in the case of the second law of thermo- 
dynamics, to confine attention to simple arithmetical means as the 
averages to be used in the interpretation of thermodynamic relations. 

It is of interest in this connexion to remark that the success of the 
thermodynamic procedure, of assigning exact values to quantities which 
cannot really be regarded as precisely defined, depends on the actually 
negligible character of the fluctuations which can be expected in the 
values of such quantities in a series of repetitions of the same thermo- 
dynamic experiment. When this condition is not satisfied, the methods 
of thermodynamics must be suitably modified, as will be discussed in 
the next chapter in § 141. 

120. The canonical ensemble as representing thermodynamic equilibrium 

We must now undertake a series of statistical mechanical investigations 
which will ultimately put us in a position (§ 130) to state the 
statistical mechanical analogue of the second law of thermodynamics. 



The first problem to be treated is that of investigating the appropriate 
ensemble for representing a system in thermodynamic equilibrium. 

In picking out a suitable ensemble for this purpose, it is to be noted 
that the systems, ordinarily treated by thermodynamic methods, are 
in actuality never in perfect isolation, but are either purposely placed in 
thermal contact with some other system such as a heat reservoir, or 
are necessarily in contact with their immediate surroundings such as 
the walls and piston of a cylinder. Hence, in accordance with §§ 111 
and 112, see in particular the discussions in §§ 111(d) and 112(b), it is 
evident that the condition of thermodynamic equilibrium for the systems 
of usual thermodynamic interest can be best represented by a canonical 
ensemble, since this has been found to give the most appropriate descrip- 
tion of equilibrium in the case of systems in thermal contact with their 
surroundings or in essential rather than perfect isolation therefrom. 

In agreement with (112.4), we may write the density matrix for a 
canonical ensemble, when expressed in the energy language, in the form 

$$\rho_{nm} = e^{(\psi - E_n)/\theta}\,\delta_{nm}, \tag{120.1}$$

where $\psi$ and $\theta$ are constants and $E_n$ is the energy eigenvalue for the 
state $n$. In accordance with this density matrix we may then take 

$$P_n = \rho_{nn} = e^{(\psi - E_n)/\theta} \tag{120.2}$$

as an expression for the equilibrium probability of finding the system 
of interest in any particular state $n$, on successive repetitions of the 
same condition of equilibrium. Since such probabilities could be determined, 
in the case of equilibrium and of true energy states, with any 
desired degree of accuracy by separating the system from its surroundings 
and taking sufficient time for the observation of its state, we regard 
ourselves as proceeding to the limit where the observed or coarse-grained 
probability $P_n$ can be set equal to the fine-grained probability $\rho_{nn}$. 

Important properties of the canonical distribution are expressed by 
the following equations. Since the total probability of finding one or 
another state will be normalized to unity, we have 

$$\sum_n P_n = \sum_n e^{(\psi - E_n)/\theta} = 1 \tag{120.3}$$

when summed over all states $n$. For the mean energy of the members 
of the ensemble we have 

$$\bar{\bar{E}} = \sum_n P_n E_n = \sum_n e^{(\psi - E_n)/\theta}\,E_n. \tag{120.4}$$




And for the value of the quantity $\bar{\bar{H}}$, corresponding to the probabilities 
for the different energy states $n$, we can write 

$$\bar{\bar{H}} = \sum_n P_n \log P_n = \sum_n e^{(\psi - E_n)/\theta}\,\frac{\psi - E_n}{\theta} = \frac{\psi - \bar{\bar{E}}}{\theta}. \tag{120.5}$$

Owing to the constancy of $\psi$ and $\theta$, the first of the above equations, 
(120.3), can also be expressed in the form 

$$Z = e^{-\psi/\theta} = \sum_n e^{-E_n/\theta}, \tag{120.6}$$

where the useful quantity $Z$, which is thus defined, may be given for 
convenience the name of sum-over-states.† We shall find this quantity 
important for our later work. 

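The immediately foregoing expressions apply to the different individual 
states $n$ which the system of interest can exhibit. In the case 
of degeneracy, if we wish for convenience to group together the $G_k$ individual 
states, all of which have the same energy $E_k$, we can rewrite these 
expressions in the forms 

$$P_k = G_k\,e^{(\psi - E_k)/\theta}, \tag{120.7}$$

$$\sum_k G_k\,e^{(\psi - E_k)/\theta} = 1, \tag{120.8}$$

$$\bar{\bar{E}} = \sum_k G_k\,e^{(\psi - E_k)/\theta}\,E_k, \tag{120.9}$$

$$\bar{\bar{H}} = \sum_k G_k\,e^{(\psi - E_k)/\theta}\,\frac{\psi - E_k}{\theta} = \frac{\psi - \bar{\bar{E}}}{\theta}, \tag{120.10}$$

$$Z = e^{-\psi/\theta} = \sum_k G_k\,e^{-E_k/\theta}. \tag{120.11}$$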
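
The relations (120.2) to (120.6) and the grouped form (120.11) are easily verified on a small example. The sketch below is illustrative only: the six energy levels, the value of the parameter theta, and the function names are assumptions chosen for the purpose, and energies are measured in arbitrary units.

```python
import math
from collections import Counter


def canonical_quantities(energies, theta):
    """Return Z, psi, the probabilities P_n, the mean energy, and the quantity H."""
    z = sum(math.exp(-e / theta) for e in energies)            # (120.6)
    psi = -theta * math.log(z)                                 # from Z = exp(-psi/theta)
    probs = [math.exp((psi - e) / theta) for e in energies]    # (120.2)
    mean_e = sum(p * e for p, e in zip(probs, energies))       # (120.4)
    h_bar = sum(p * math.log(p) for p in probs)                # (120.5), first form
    return z, psi, probs, mean_e, h_bar


def grouped_sum_over_states(energies, theta):
    """Sum over distinct energies E_k weighted by their degeneracy G_k, as in (120.11)."""
    degeneracy = Counter(energies)
    return sum(g * math.exp(-e / theta) for e, g in degeneracy.items())


# Hypothetical spectrum: one state at 0, two at 1, three at 2 (arbitrary units).
levels = [0.0, 1.0, 1.0, 2.0, 2.0, 2.0]
theta = 0.7

Z, psi, P, E_mean, H_bar = canonical_quantities(levels, theta)
Z_grouped = grouped_sum_over_states(levels, theta)

print(sum(P))                            # normalization (120.3)
print(H_bar, (psi - E_mean) / theta)     # the two forms of (120.5)
print(Z, Z_grouped)                      # (120.6) against (120.11)
```

Each printed pair should agree apart from rounding, which is merely a restatement of the algebra above.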


In using such a canonical ensemble to represent a given system of 
interest, in a condition of thermodynamic equilibrium, it will be noted 
that the expression for the normalization of the ensemble as given by 
(120.3) or (120.8), together with the expression for the mean energy of 
the members of the ensemble as given by (120.4) or (120.9), provide 
two equations for the determination of the two distribution parameters 
$\psi$ and $\theta$. Hence the ensemble may be regarded as corresponding to a 
given system of interest in a condition of equilibrium with a specified 
value for the mean energy $\bar{\bar{E}}$ which is to be expected. In using such an 
ensemble for the interpretation of thermodynamic phenomena, this 
mean value of energy would then be correlated, as in our previous 
discussion of the first law, with the precise value of energy which 
would be assigned in thermodynamic considerations. 

† This agrees with the form assumed by Planck's so-called 'Zustandssumme' when 
consideration is given to the states for a system as a whole. See Planck, Theory of 
Heat, translated by Brose, London, 1932, p. 253, equation (419). 

In this connexion it is of importance to note that the canonical dis- 
tribution actually does have a character, under usual circumstances, 
such that the fluctuations in energy around the mean would be small. 
The tendency towards concentration into states of neighbouring energy 
is brought about by the combined effects of the form of the distribution 
law (120.2) with the negative of $E_n$ appearing in the exponent, and of 
the rapid increase with energy in the number of possible energy eigenstates 
which occurs in the case of systems of many degrees of freedom. 
Taking specified values of $\psi$ and $\theta$, the first of the above-mentioned 
factors then makes systems of relatively high energy improbable, and 
the second factor makes systems of relatively low energy improbable. 
In agreement with the remarks made at the end of the preceding 
section, it will be appreciated that this tendency for the systems in 
a representative canonical ensemble to be concentrated in neighbouring 
states of a similar character is very important for the statistical 
mechanical explanation of the ordinary principles of thermodynamics, 
since otherwise it would be difficult to account for the actual success 
of those principles when applied from a point of view which assigns 
precise values to such macroscopic quantities as energy. 
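
The competition between the two factors can be illustrated with a toy model which is not taken from the text: a system of N independent two-level units, each with level spacing eps, is assumed, so that the energy k * eps is shared by G_k = N! / (k! (N - k)!) individual states. The sketch below evaluates the grouped distribution (120.7) for this assumed model and computes the ratio of the standard deviation of the energy to its mean.

```python
import math


def relative_energy_fluctuation(n_units, eps, theta):
    """Standard deviation of the energy divided by its mean, for the grouped
    canonical distribution P_k proportional to G_k * exp(-k * eps / theta)."""
    log_w = []
    for k in range(n_units + 1):
        # log of the binomial degeneracy G_k, via log-gamma to avoid overflow
        log_g = (math.lgamma(n_units + 1) - math.lgamma(k + 1)
                 - math.lgamma(n_units - k + 1))
        log_w.append(log_g - k * eps / theta)
    top = max(log_w)
    w = [math.exp(x - top) for x in log_w]
    z = sum(w)
    mean_e = sum(wk * k * eps for k, wk in enumerate(w)) / z
    var_e = sum(wk * (k * eps - mean_e) ** 2 for k, wk in enumerate(w)) / z
    return math.sqrt(var_e) / mean_e


eps, theta = 1.0, 1.0   # arbitrary units, assumed values
small = relative_energy_fluctuation(100, eps, theta)
large = relative_energy_fluctuation(10000, eps, theta)
print(small, large, small / large)
```

For this model the relative fluctuation falls off as the inverse square root of N, so that increasing N a hundredfold reduces it tenfold, in keeping with the concentration into states of neighbouring energy described above.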

A more complete consideration of the fluctuations to be expected at 
thermodynamic equilibrium, in the values of energy and other quan- 
tities pertaining to a system of interest, will be given in § 141 at the 
end of the next chapter. When such fluctuations become sufficiently 
large, the methods of thermodynamics must either be modified or be 
supplemented by more directly statistical considerations. 


121. Relation connecting the values of $\bar{\bar{H}}$ in neighbouring 
canonical ensembles 

We have seen in the preceding section that the canonical distribution, 
representing a system in thermodynamic equilibrium, can be described 
in the energy language by the expression 

$$P_n = e^{(\psi - E_n)/\theta} \tag{121.1}$$

giving the probabilities for finding the system in the different steady 
states $n$, corresponding to the energy eigenvalues $E_n$. It will be noted 
that any particular canonical distribution is directly dependent on the 
values of the two distribution parameters $\psi$ and $\theta$. It will also be 
appreciated, however, that the distribution can be taken as indirectly 
dependent on the values assigned to any external coordinates $a_1, a_2, a_3, \ldots$, 
etc., which describe the system, since a change in these (for example, 
a change in volume) would affect the eigenvalues $E_n$ for the energies 
of the different states $n$. 

In the present section we wish to consider the effect on the important 
quantity $\bar{\bar{H}}$ of making small variations in the parameters and 
external coordinates $\psi, \theta, a_1, a_2, a_3, \ldots$ in such a way as to secure a 
neighbouring canonical distribution suitable for representing the same 
system in a neighbouring condition of thermodynamic equilibrium. As 
the expression for $\bar{\bar{H}}$, corresponding to the probabilities for the true 
energy states $n$, we can write in the case of a canonical ensemble 

$$\bar{\bar{H}} = \sum_n P_n \log P_n = \frac{\psi - \bar{\bar{E}}}{\theta}, \tag{121.2}$$

as given by our earlier equation (120.5). Hence for the variation in $\bar{\bar{H}}$, 
when we change to a neighbouring canonical ensemble, we can at 
once take 

$$\delta\bar{\bar{H}} = \frac{\delta\psi}{\theta} - \frac{\delta\bar{\bar{E}}}{\theta} - \frac{\psi - \bar{\bar{E}}}{\theta^2}\,\delta\theta. \tag{121.3}$$


We shall be interested, however, in putting this expression into a more 
significant form. 

Since we are making a change to a new canonical ensemble, it is 
evident that our variations must be so made as to preserve the truth 
of the relation 

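$$\sum_n e^{(\psi - E_n)/\theta} = 1, \tag{121.4}$$

which makes the total probability for finding the system in one or 
another state $n$ equal to unity. Hence the variations which we make 
in the quantities $\psi, \theta, a_1, a_2, a_3, \ldots$ must be taken in any case in such 
a way as to satisfy 

$$\sum_n e^{(\psi - E_n)/\theta} \left[ \frac{\delta\psi}{\theta} - \frac{1}{\theta} \left( \frac{\partial E_n}{\partial a_1}\,\delta a_1 + \frac{\partial E_n}{\partial a_2}\,\delta a_2 + \frac{\partial E_n}{\partial a_3}\,\delta a_3 + \cdots \right) - \frac{\psi - E_n}{\theta^2}\,\delta\theta \right] = 0. \tag{121.5}$$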


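It will now be convenient for any state $n$ to define the generalized 
external forces $A_1, A_2, A_3, \ldots$ which correspond to the external coordinates 
$a_1, a_2, a_3, \ldots$ by the equations 

$$A_1 = -\frac{\partial E_n}{\partial a_1}, \qquad A_2 = -\frac{\partial E_n}{\partial a_2}, \qquad A_3 = -\frac{\partial E_n}{\partial a_3}, \qquad \ldots \tag{121.6}$$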


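Introducing these expressions, and making use of the general method 
of computing mean values, (121.5) can then be rewritten in the form 

$$\frac{\delta\psi}{\theta} + \frac{1}{\theta} \left( \bar{\bar{A}}_1\,\delta a_1 + \bar{\bar{A}}_2\,\delta a_2 + \bar{\bar{A}}_3\,\delta a_3 + \cdots \right) - \frac{\psi - \bar{\bar{E}}}{\theta^2}\,\delta\theta = 0, \tag{121.7}$$

where $\bar{\bar{A}}_1, \bar{\bar{A}}_2, \bar{\bar{A}}_3, \ldots$ denote the mean values of the external forces calculated 
over the members of the ensemble, and $\delta a_1, \delta a_2, \delta a_3, \ldots$ denote 
the variations made in the external coordinates, the same, of course, 
for each member of the ensemble in accordance with our original discussion 
of external parameters. 

Combining (121.7) with our previous expression (121.3) for the variation 
in $\bar{\bar{H}}$, we now obtain the desired result 

$$-\delta\bar{\bar{H}} = \frac{\delta\bar{\bar{E}}}{\theta} + \frac{1}{\theta} \left( \bar{\bar{A}}_1\,\delta a_1 + \bar{\bar{A}}_2\,\delta a_2 + \bar{\bar{A}}_3\,\delta a_3 + \cdots \right). \tag{121.8}$$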

We shall consider the application of this important relation to the 
interpretation of thermodynamics in the next section. 
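
Before proceeding, the relation (121.8) may be tested by a finite-difference calculation. The sketch below rests on an assumed model that does not appear in the text: a single external coordinate a, with energy levels E_n = n^2 / a^2 in arbitrary units (of the type found for a particle in a box of width a), truncated at n = 40. The forces then follow from (121.6) as A_n = 2 n^2 / a^3. All names and numerical values are illustrative.

```python
import math


def levels(a, n_max=40):
    """Assumed model spectrum E_n = n^2 / a^2 (arbitrary units)."""
    return [n * n / (a * a) for n in range(1, n_max + 1)]


def forces(a, n_max=40):
    """Generalized forces A_n = -dE_n/da = 2 n^2 / a^3, following (121.6)."""
    return [2.0 * n * n / a ** 3 for n in range(1, n_max + 1)]


def canonical(a, theta):
    """Mean energy, the quantity H, and the mean force in the canonical ensemble."""
    e = levels(a)
    z = sum(math.exp(-x / theta) for x in e)
    p = [math.exp(-x / theta) / z for x in e]
    mean_e = sum(pi * ei for pi, ei in zip(p, e))
    # states whose probability underflows to zero contribute nothing to H
    h_bar = sum(pi * math.log(pi) for pi in p if pi > 0.0)
    mean_a = sum(pi * fi for pi, fi in zip(p, forces(a)))
    return mean_e, h_bar, mean_a


a0, theta0 = 1.0, 2.0          # original canonical ensemble
da, dtheta = 1e-6, 1e-6        # small variations to a neighbouring ensemble

e0, h0, force0 = canonical(a0, theta0)
e1, h1, _ = canonical(a0 + da, theta0 + dtheta)

lhs = -(h1 - h0)                                   # left side of (121.8)
rhs = (e1 - e0) / theta0 + force0 * da / theta0    # right side of (121.8)
print(lhs, rhs)
```

The two sides agree to first order in the variations, the residual being of the second order of small quantities as expected.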

122. Statistical mechanical analogues of entropy, temperature, and free energy 

We have already proposed statistical mechanical analogues for the 
thermodynamic quantities appearing in the first law of thermodyna- 
mics. Corresponding to the energy of the system, we take the mean 
energy $\bar{\bar{E}}$ of the members of a representative ensemble appropriate for 
the thermodynamic system of interest; corresponding to the heat absorbed, 
we take the mean energy $\bar{\bar{Q}}$ transferred between members in 
a combined ensemble representing the system of interest and the heat 
bath; and corresponding to the work done, we take the mean work $\bar{\bar{W}}$ 
performed by the members of the ensemble by the forces which they 
exert on external bodies. We have seen that these quantities satisfy 
an equation of the same form as that for the ordinary first law of 
thermodynamics, 

$$\Delta\bar{\bar{E}} = \bar{\bar{Q}} - \bar{\bar{W}}. \tag{122.1}$$

In the present section we wish to propose statistical mechanical 
analogues for the important thermodynamic quantities entropy, tem- 
perature, and free energy which appear in considerations of the second 
law of thermodynamics. In following sections we shall then show that 
the proposed analogues do have properties appropriate for representing 
these quantities, and finally, in § 130, we shall feel able to make a complete 
statement of the statistical mechanical form of the second law 
of thermodynamics. 

The statistical mechanical analogues of entropy and temperature are 
provided by comparing the form of the statistical mechanical relation, 
derived in the preceding section, 

$$-\delta\bar{\bar{H}} = \frac{\delta\bar{\bar{E}}}{\theta} + \frac{1}{\theta} \left( \bar{\bar{A}}_1\,\delta a_1 + \bar{\bar{A}}_2\,\delta a_2 + \bar{\bar{A}}_3\,\delta a_3 + \cdots \right), \tag{122.2}$$

with the form of the familiar thermodynamic relation 

$$\delta S = \frac{\delta E}{T} + \frac{1}{T} \left( A_1\,\delta a_1 + A_2\,\delta a_2 + A_3\,\delta a_3 + \cdots \right). \tag{122.3}$$

The first of these equations expresses the variation in the statistical 
mechanical quantity $\bar{\bar{H}}$, corresponding to the probabilities for energy 
states, when we pass from the canonical ensemble which represents a 
system in one condition of thermodynamic equilibrium to that which 
represents it in a neighbouring condition of thermodynamic equilibrium; 
while the second of the equations expresses the variation in the 
thermodynamic quantity, the entropy of the system $S$, when we pass 
from the one condition of equilibrium to the other. 

It has already seemed appropriate to correlate the thermodynamic 
quantities energy $E$, external forces $A_1, A_2, A_3, \ldots$, and external coordinates 
$a_1, a_2, a_3, \ldots$ for the system of interest, with the mean energy 
$\bar{\bar{E}}$, mean external forces $\bar{\bar{A}}_1, \bar{\bar{A}}_2, \bar{\bar{A}}_3, \ldots$, and external coordinates 
$a_1, a_2, a_3, \ldots$ for the corresponding representative ensemble, in accordance with 
the expressions 

$$E \leftrightarrow \bar{\bar{E}}, \qquad A_1 \leftrightarrow \bar{\bar{A}}_1, \quad A_2 \leftrightarrow \bar{\bar{A}}_2, \ \ldots, \qquad a_1 \leftrightarrow a_1, \quad a_2 \leftrightarrow a_2, \ \ldots \tag{122.4}$$

Hence the similarity in form of (122.2) and (122.3) now makes it 
reasonable to propose the further correlations of the thermodynamic 
quantities entropy $S$ and temperature $T$ with the statistical mechanical 
quantities $-\bar{\bar{H}}$ and $\theta$, in accordance with the expressions 

$$S \leftrightarrow -k\bar{\bar{H}}, \qquad T \leftrightarrow \frac{\theta}{k}, \tag{122.5}$$

where a constant $k$, with the dimensions of energy over temperature, 
has to be introduced to allow for difference in units. As indicated by 
the notation, this constant will actually turn out to be equal to Boltzmann's 
constant, the perfect gas constant per molecule (see § 136(c)). By 
making use of the correlations (122.4) and (122.5), the thermodynamic 
relation (122.3) is indeed seen to agree with the statistical mechanical 
relation (122.2). 

Having introduced the analogues for entropy and temperature, we 
can now at once find the analogue for free energy. To do this we have 
only to compare our previous statistical mechanical relation (120.5), 
which can be solved for the parameter $\psi$ in the form 

$$\psi = \bar{\bar{E}} + \theta\,\bar{\bar{H}}, \tag{122.6}$$

with the thermodynamic equation 

$$A = E - TS \tag{122.7}$$

by which the free energy $A$ of a thermodynamic system was defined 
by Helmholtz.† Noting (122.4) and (122.5), it will then be seen that we can make 
the immediate correlation 

$$A \leftrightarrow \psi. \tag{122.8}$$

It may also be noted, in accordance with (120.6), that the free energy 
can be correlated with the sum-over-states $Z$ by the equation 

$$A \leftrightarrow -kT \log Z = -kT \log \sum_n e^{-E_n/kT}. \tag{122.9}$$

This result is often very useful in making practical calculations of this 
important quantity $A$ which characterizes a system in thermodynamic 
equilibrium. 
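
As an illustration of such a calculation, the sketch below evaluates the free energy from the sum-over-states by (122.9) and compares it with E - TS formed from the correlations (122.4) and (122.5). The three energy levels, the temperature of 300 K, and the function name are assumptions made for the example; the value of Boltzmann's constant is the one fixed by the present SI definition.

```python
import math

K_BOLTZMANN = 1.380649e-23  # joule per kelvin (exact in the SI since 2019)


def thermodynamic_analogues(energies, temperature, k=K_BOLTZMANN):
    """Mean energy, entropy -k*H, and free energy -k*T*log(Z) of a canonical ensemble."""
    theta = k * temperature                              # T <-> theta / k
    z = sum(math.exp(-e / theta) for e in energies)      # sum-over-states (120.6)
    probs = [math.exp(-e / theta) / z for e in energies]
    mean_e = sum(p * e for p, e in zip(probs, energies))
    h_bar = sum(p * math.log(p) for p in probs)
    entropy = -k * h_bar                                 # S <-> -k * H
    free_energy = -theta * math.log(z)                   # (122.9)
    return mean_e, entropy, free_energy


levels_joule = [0.0, 1.0e-21, 2.0e-21]   # hypothetical spectrum, in joules
T = 300.0                                # kelvin

E_mean, S, A = thermodynamic_analogues(levels_joule, T)
print(A, E_mean - T * S)
```

The two printed values coincide apart from rounding, in agreement with (122.6) and (122.7).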

The above correlations, of the two essential thermodynamic quantities 
temperature $T$ and entropy $S$ with the statistical mechanical quantities 
$\theta$ and $\bar{\bar{H}}$, have been obtained for the special case of equilibrium by 
comparing the change in $S$ when a thermodynamic system is changed 
from one condition of thermodynamic equilibrium to another, with 
the change in $\bar{\bar{H}}$, when the corresponding representative ensemble is 
changed from one canonical distribution to another, i.e. from one state 
of statistical equilibrium to another. Furthermore, the expression for 
$\bar{\bar{H}}$, used in making the above correlations, is the special one that would 
correspond to contemplated observations on true energy states carried 
out with a limiting accuracy such that coarse-grained probabilities $P_n$ 
and fine-grained probabilities $\rho_{nn}$ would come into agreement. Hence 
it will now be desirable to make some further remarks as to possibilities 
that might be available also in the absence of equilibrium for the 
correlation of temperature and entropy with statistical mechanical 
quantities, and as to the role played in any such correlations by the 
type of observation that we have in mind. 

Let us first consider the concept of temperature. In a case of equilibrium 
the science of thermodynamics ascribes a unique and precise 
value to the temperature $T$ of a system as a whole, which could be 
empirically determined by any suitable thermometer inserted into any 
part of the system considered. This corresponds to the statistical 
mechanical consideration that a system of interest, in a condition of 
equilibrium with a specified mean energy, is to be represented by a 
canonical ensemble with a definite value for the distribution parameter 
$\theta$ which can be immediately correlated with $T$. 

† We use the letter $A$ to designate the free energy of a system as originally defined by 
Helmholtz. This quantity, $A = E - TS$, should be carefully distinguished from the 
thermodynamic potential, $F = E - TS + pv$, which is often called free energy by 
chemists. 

In the absence of equilibrium the idea of the temperature of a system 
is regarded in thermodynamic treatments as losing its unique and pre- 
cise character, since a thermometer in different parts of the system or 
thermometers of different constructions would in general give different 
indications. This corresponds to the statistical mechanical consideration 
that a system which is not in equilibrium would have to be 
represented by an ensemble which is not in equilibrium, rather than 
by a canonical distribution, so that no distribution parameter $\theta$ would 
be available for purposes of correlation. Nevertheless, in so far as it 
remains valid in the absence of equilibrium to apply the concept of 
temperature in an approximate manner to parts of the whole system 
or to partial aspects of its behaviour, it also remains possible to find 
the appropriate statistical mechanical representation of the situation. 
For example, in a case of heat conduction from one part of a system 
to another to which temperatures $T$ can be approximately assigned, 
the conditions of these parts may be described by nearly canonical 
distributions with corresponding values of $\theta$, and the condition of the 
system as a whole may then be represented by a combined ensemble 
constructed in the general manner which was first discussed in § 107. 

Let us now turn to the concept of entropy. In a case of equilibrium 
the science of thermodynamics ascribes a precise value to the entropy 
$S$ of a system, which could be empirically determined by evaluating 
the integral $\int dQ/T$ for any reversible process (see § 130) by which the 
system could be taken from its selected standard condition of thermodynamic 
equilibrium to the condition of interest. This corresponds to 
the statistical mechanical consideration that we have decided to correlate 
the entropy $S$ for a system in thermodynamic equilibrium with 
the precise value of $\bar{\bar{H}} = \sum_n P_n \log P_n$ that would be calculated from the 
(exact) probabilities for the true energy states n in the canonical 
ensemble which we take as representing the equilibrium. 

In the absence of equilibrium the idea of the entropy of a system 
is still regarded in thermodynamic treatments as retaining at least an 
approximately valid character. This is made possible by the considera- 
tion that the entropy of a system is equal to the sum of the entropies 
of its parts, and that the system as a whole can be treated as consisting 
of parts, or of components, approximately in conditions such as could 
be reached by reversible processes from a standard condition. In 
accordance with this circumstance, we shall now regard the entropy $S$ 
for any system as correlated in general with quantities pertaining to 
the corresponding representative statistical ensemble by 

$$S \leftrightarrow -k\bar{\bar{H}} = -k \sum_n P_n \log P_n, \tag{122.10}$$

where an additive constant may be included if desired to allow for 
choice of standard condition, and where the quantities $P_n$ are the 
coarse-grained probabilities for those states $n$ which are involved in 
the kind of observations on the system that concern us. 

In case we are concerned with an initial approximate observation on 
the condition of a system which is changing with time, the entropy $S$ 
would be correlated with the coarse-grained probabilities $P_n$, which we 
determine by our initial approximate measurement of states of the kind 
$n$, and which we may then use in setting up a representative ensemble 
for the system with the initial density matrix $\rho_{nm} = P_n\,\delta_{nm}$. It will be 
noted, in accordance with the principles of the quantum mechanics, 
that an immediate repetition of the same kind of observation would 
lead to the same value of entropy. It will also be appreciated that the 
value of entropy thus obtained could be expected to be in consonance 
with the approximate value that would be assigned in a thermodynamic 
treatment of the system on the basis of an observation of its condition. 
In case we are then concerned with a subsequent approximate observation 
on the condition of such a system, at a time later than that of the 
initial observation or of a different kind, the entropy $S$ would be correlated 
with the coarse-grained probabilities $P_n$ which would be predicted 
for those states $n$ that now interest us, with the help of the 
representative ensemble that has been set up. Finally, in case we are 
concerned with the precise observation which might be appropriately 
made on the true energy state of a system in a condition of equilibrium, 
the entropy $S$ would be correlated with the coarse-grained probabilities 
$P_n$ which would be predicted for such energy states $n$, from the ensemble 
representing the equilibrium, at the limit of exact measurement where 
coarse- and fine-grained probabilities come into agreement. 
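
The dependence of the entropy on the fineness of the contemplated observation can be illustrated by a simple calculation. The sketch below assumes, purely for illustration, that coarse-graining consists in replacing the fine-grained probabilities within each group of states by their average over the group; the six probabilities, the grouping into threes, and the function names are likewise assumptions. Entropy is expressed in units of k.

```python
import math


def h_quantity(probs):
    """The quantity H = sum of P log P; states of zero probability contribute nothing."""
    return sum(p * math.log(p) for p in probs if p > 0.0)


def coarse_grain(fine_probs, group_size):
    """Replace each probability by the average over its group of neighbouring states."""
    coarse = []
    for start in range(0, len(fine_probs), group_size):
        group = fine_probs[start:start + group_size]
        average = sum(group) / len(group)
        coarse.extend([average] * len(group))
    return coarse


fine = [0.40, 0.10, 0.05, 0.05, 0.25, 0.15]   # hypothetical fine-grained probabilities
coarse = coarse_grain(fine, 3)                # observation resolving only groups of three

entropy_fine = -h_quantity(fine)      # in units of k
entropy_coarse = -h_quantity(coarse)  # in units of k
print(entropy_fine, entropy_coarse)
```

The coarser observation can never give the smaller entropy, and the two values agree only when the probabilities within each group were already equal.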

At first sight the circumstance, that the entropy of a system can 
depend on the kind of observation that is contemplated for perform- 
ance, might seem to stand in contradiction to the appearance of that 
quantity in definite thermodynamic expressions. This proves not to 
be the case, however, since the more general statements of thermo- 



640 


EXPLANATION OF THERMODYNAMICS Chap. XIII 


dynamics are concerned primarily with inequalities rather than equa- 
tions which are to be satisfied by the entropy S. It is also to be 
remarked in this connexion that we shall find it generally appropriate, 
in the development of thermodynamic analogies which follows, to 
regard ourselves in the case of a system which is changing with time 
as concerned with approximate observations on suitable unperturbed 
energy eigenstates which are nearly steady states for the system, and 
in the case of a system which is in equilibrium as concerned with precise 
observations on true energy states. Hence, in the absence of explicit 
mention, it is to be assumed in what follows that the values of $\bar{\bar{H}}$ and 
S refer to observations on nearly steady or quite steady states of the 
system of interest. 

We may now sum up the results of the present section by writing down 
once more the three correlations 

$$S \leftrightarrow -k\bar{\bar{H}}, \qquad T \leftrightarrow \frac{\theta}{k}, \qquad A \leftrightarrow \psi, \tag{122.11}$$

which we take as representing the statistical mechanical interpretation 
of the thermodynamic quantities entropy, temperature, and free energy. 
These proposals may be regarded, if desired, as tentative for the time 
being. The next few sections will be devoted to showing that the 
statistical mechanical quantities $-\bar{\bar{H}}$ and $\theta$, and thus also by implication 
$\psi$, do behave in such a way as to justify the above correlation. 

123. Effect on $\bar{\bar{H}}$ of leaving a system in essential isolation 

The first consideration, justifying the above proposals for the statisti- 
cal mechanical interpretation of thermodynamic quantities, will be 
based on the behaviour that would be expected in the course of time 
for an ensemble representing a system left to itself in essential isolation, 
the walls of its container and other surroundings with which it necessarily 
interacts being in such a condition that in the mean no net 
transfer of energy to or from the system proper takes place. Such a 
set-up would appear to provide the appropriate statistical mechanical 
representation for a system which would be regarded in thermodynamics 
as left to itself in complete isolation. 

In accordance with the discussion of essential isolation given in 
§ 111 (d), the ensemble representing the above situation could be ex- 
pected to change, in the course of a time long enough for extensive 
interaction with the surroundings, to a distribution of probabilities 
for the different nearly steady states that concern us which would 
be practically a canonical distribution. Hence, in accordance with a 
property of the canonical ensemble shown in § 112(b), the ultimate 
distribution could be taken as closely corresponding to a minimum 
value for the quantity $\bar{\bar{H}}$ under the condition of constant mean energy 
$\bar{\bar{E}}$ for the system proper. 

Thus the behaviour in time of an essentially isolated system can be 
described in the language of statistical mechanics as a tendency for the 
corresponding quantity $\bar{\bar{H}}$ to change to the minimum value compatible 
with a constant value for the mean energy $\bar{\bar{E}}$. On the other hand, the 
behaviour of an isolated system can be described in the language of 
thermodynamics by the well-known principle that the entropy $S$ for 
the system would tend to increase with time to the maximum value 
compatible with its constant energy $E$. This thus provides further 
justification for our proposed correlation between $S$ and $-\bar{\bar{H}}$. 
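
The minimum property of the canonical distribution can be exhibited on a small example. In the sketch below a three-level system with energies 0, 1, 2 (arbitrary units) and theta = 1 is assumed; the displacement direction (1, -2, 1) is chosen because it alters neither the total probability nor the mean energy. These choices and the function names are illustrative and are not taken from the text.

```python
import math


def h_quantity(probs):
    """The quantity H = sum of P log P for strictly positive probabilities."""
    return sum(p * math.log(p) for p in probs)


def canonical_probs(energies, theta):
    """Canonical probabilities exp(-E_n/theta) / Z."""
    z = sum(math.exp(-e / theta) for e in energies)
    return [math.exp(-e / theta) / z for e in energies]


energies = [0.0, 1.0, 2.0]     # hypothetical equally spaced levels
canonical = canonical_probs(energies, theta=1.0)

# Displacement preserving both normalization and mean energy:
# 1 - 2 + 1 = 0 and 1*0 - 2*1 + 1*2 = 0.
direction = [1.0, -2.0, 1.0]


def displaced(t):
    """Distribution displaced from the canonical one by the amount t."""
    return [p + t * d for p, d in zip(canonical, direction)]


def mean_energy(probs):
    return sum(p * e for p, e in zip(probs, energies))


h_canonical = h_quantity(canonical)
h_displaced = {t: h_quantity(displaced(t)) for t in (-0.04, -0.02, 0.02, 0.04)}

print(h_canonical)
for t, value in sorted(h_displaced.items()):
    print(t, value, mean_energy(displaced(t)))
```

Every displaced distribution has the same mean energy as the canonical one but a larger value of the quantity H, that is, a smaller value of the entropy analogue.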

In this connexion it will be remembered that the derivation which 
we actually gave for the H-theorem was dependent on the introduction 
of the hypotheses of equal a priori probabilities and of random a priori 
phases in setting up the initial representative ensemble for the system 
of interest, and that we had to allow for the possibility of occasional 
deviations from our conclusions when these assumptions were not near 
enough valid. Accepting the proposed correlation of $S$ with $-\bar{\bar{H}}$, we 
must now regard the certainty with which $S$ does increase in a system 
appreciably displaced from equilibrium as due to the infrequent occurrence 
of important deviations from the consequences of those hypotheses 
in the case of actual thermodynamic systems of many degrees 
of freedom. 

124. Effect on $\bar{\bar{H}}$ of adiabatic changes in external coordinates 

We shall next wish to obtain a second illustration of the propriety 
of correlating $S$ with $-\bar{\bar{H}}$ by considering the effect on a canonical 
ensemble of making an actual change in the values of the external 
coordinates for the systems in the ensemble. Here we shall distinguish 
the two limiting cases of an extremely slow variation in the values of 
the coordinates and of an abrupt change in their values. We shall thus 
obtain analogies for the two thermodynamic cases of reversible and 
irreversible changes in external coordinates. The treatments to be given 
will be based on our previous investigation of the effect of such changes 
on the probabilities for the different states of a single quantum 
mechanical system as carried out in § 97. 

Let us consider an isolated thermodynamic system in a condition of 
thermodynamic equilibrium together with the corresponding canonical 
ensemble of members which would represent it in that condition, in 
accordance with the concept of essential isolation. In the first place, 
let us then make a small change $\delta a$ in the value of some external 
coordinate $a$, pertaining to the system of interest and to the members 
of the corresponding ensemble, the change being carried out adiabatically 
from the thermodynamic point of view and hence appropriately 
taken as essentially adiabatic from the mechanical point of view, in 
accordance with the discussion of § 111(d). In the second place, let us 
then allow sufficient time so that the system of interest comes to its 
new condition of thermodynamic equilibrium and the ensemble comes 
to the corresponding new canonical distribution, which, as we have 
seen, would be established in the course of sufficient time under condi- 
tions of essential isolation. 

We must now give separate attention to the two limiting cases when
the change $\delta a$ is made very slowly and when it is made quite abruptly.
We may begin with the case of a very slow change.

(a) Reversible adiabatic change in an external coordinate. Since the
total change in the ensemble is from one given canonical distribution
to a neighbouring canonical distribution, we can in any case apply our
general expression (121.8) connecting the values of $\bar{\bar{H}}$ in two such
ensembles. This gives us, correct to the first order in small quantities,

$$-\delta\bar{\bar{H}} = \frac{\delta\bar{\bar{E}}}{\theta} + \frac{\bar{\bar{A}}}{\theta}\,\delta a, \tag{124.1}$$

where $\delta\bar{\bar{E}}$ is the change in the mean energy of the systems in the
ensemble, $\bar{\bar{A}}$ is the mean external force corresponding to the coordinate
$a$, and $\theta$ is the parameter defining the original canonical distribution.

To evaluate this expression, in our present case, let us consider the
change in mean energy $\delta\bar{\bar{E}}$ which is to be expected when the change in
the coordinate $a$ is made very slowly. Since we have shown, by equation
(97.29) of § 97 (b), that the probabilities for finding a single
system in its different energy states $n$ would not be altered by a vanishingly
slow rate of change in the coordinate $a$, we can also conclude that
the probabilities

$$P_n = e^{(\psi - E_n)/\theta} \tag{124.2}$$

for finding different states $n$ in the ensemble would not be altered
merely as a direct consequence of the change $\delta a$. Furthermore, in
accordance with the essentially adiabatic character of the change in
which we are interested, we can conclude that the mean energy of the
ensemble would not be affected by the subsequent readjustment in
probabilities to a new canonical distribution which we assume to take



place in the course of time by interaction with the surroundings. Hence,
for the total change in the mean energy $\bar{\bar{E}}$, we need only consider that
which results from the change in eigenvalues when we change the
external coordinate $a$. This then permits us to write the simple expression

$$\delta\bar{\bar{E}} = \sum_n P_n \frac{\partial E_n}{\partial a}\,\delta a = -\bar{\bar{A}}\,\delta a, \tag{124.3}$$

where we equate the mean value of $\partial E_n/\partial a$ to the negative of the mean
generalized force $\bar{\bar{A}}$ in accordance with the definition of that quantity
by (121.6).

Combining (124.1) and (124.3), we are then led in the present case
to the result

$$\delta\bar{\bar{H}} = 0. \tag{124.4}$$

In accordance with this important result there would be no resulting
change in the value of $\bar{\bar{H}}$ if we make small adiabatic changes at a
vanishingly slow rate in the external coordinates $a$ for the systems in
a canonical ensemble and allow the re-establishment of the corresponding
new canonical distribution. Furthermore, by making a succession
of such changes, allowing establishment of equilibrium at each stage,
it is evident that we could then also make a finite change in external
coordinates without alteration in $\bar{\bar{H}}$. And by reversing the change in
external coordinates it is evident that we could return the ensemble of
systems, including the external bodies acted on by forces, to their
original condition. Hence, making use of our proposed correlations of
a thermodynamic system in equilibrium with a canonical ensemble
of systems, and of entropy $S$ with the negative of $\bar{\bar{H}}$, we are thus
provided with satisfactory statistical mechanical analogies for the constancy
of entropy and accompanying reversibility of the process, when
the coordinates for a thermodynamic system are adiabatically changed
at a rate slow enough to preserve at all times a condition of thermodynamic
equilibrium.
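
The result for a very slow change can be checked numerically. The following is a minimal sketch under stated assumptions, not part of the original treatment: the spectrum in `levels` (one part depending on the coordinate a, one part independent of it), the parameter values, and all function names are hypothetical choices made only for illustration. The probabilities are held fixed while the levels move, the new canonical distribution with the same mean energy is then found by bisection, and the change in the quantity sum P log P is compared with the first-order term.

```python
import math

def levels(a, n_max=40):
    """Hypothetical energy levels; only the first term depends on the coordinate a."""
    return [n * n / (a * a) + 0.5 * n for n in range(1, n_max + 1)]

def canonical(energies, theta):
    """Canonical probabilities P_n proportional to exp(-E_n / theta)."""
    e_min = min(energies)  # shift by the lowest level for numerical stability
    weights = [math.exp(-(e - e_min) / theta) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def h_bar(probs):
    """The quantity sum_n P_n log P_n."""
    return sum(p * math.log(p) for p in probs if p > 0.0)

def mean(probs, values):
    return sum(p * v for p, v in zip(probs, values))

def theta_for_energy(energies, target, lo=0.1, hi=1000.0):
    """Bisect for theta; the canonical mean energy rises monotonically with theta."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mean(canonical(energies, mid), energies) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def slow_change(a, theta, da):
    """Return (change in mean energy, change in H) for a very slow change da."""
    e_old, e_new = levels(a), levels(a + da)
    p = canonical(e_old, theta)        # probabilities are left unaltered
    e_mean_new = mean(p, e_new)        # only the eigenvalues have moved
    d_e = e_mean_new - mean(p, e_old)
    theta_new = theta_for_energy(e_new, e_mean_new)  # new canonical distribution
    d_h = h_bar(canonical(e_new, theta_new)) - h_bar(p)
    return d_e, d_h

if __name__ == "__main__":
    for da in (1e-2, 1e-3):
        print(da, slow_change(1.0, 5.0, da))
```

In this sketch the change in H should come out non-positive and of the second order in the change of coordinate, so that it is negligible beside the first-order change in mean energy divided by the distribution parameter.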

In connexion with the demonstration of (124.4), it should be noted
that it is the unchanging values, assignable to the probabilities for
different states $n$ when the change $\delta a$ is made slowly enough, rather
than the subsequent readjustment to a new canonical distribution,
which makes it possible to conclude that the value of $\delta\bar{\bar{H}}$ would be
zero. Indeed, since the quantity $\bar{\bar{H}}$ may be regarded as defined in the
present connexion by

$$\bar{\bar{H}} = \sum_n P_n \log P_n, \tag{124.5}$$

we see that unchanged values for the probabilities when the change
$\delta a$ is made would leave $\bar{\bar{H}}$ strictly unaltered, and that the subsequent



readjustment to a new canonical distribution, corresponding to the new
value for the mean energy $\bar{\bar{E}}$, would actually lead to a slight decrease
in $\bar{\bar{H}}$. Nevertheless, this subsequent change in $\bar{\bar{H}}$ would only be of the
second order, since the expression (124.1), for the dependence of $\delta\bar{\bar{H}}$ on
$\delta\bar{\bar{E}}$ and $\delta a$, which forms the basis for the above treatment, is itself
correct to the first order. It will thus be seen that the analogy which
we have given for reversible, adiabatic, thermodynamic processes depends
essentially on treating the rate of change in external coordinates,
in the first place, as sufficiently slow so that the probabilities for
different states $n$ will not be altered as a purely mechanical consequence
of the change, and in the second place, as still further sufficiently slow
so that the readjustment to an appropriate canonical distribution can
be regarded as already completed at each new step of the process. This
may be regarded as a statistical mechanical formulation of the thermodynamic
idea that reversible processes must be carried out sufficiently
slowly so as to preserve mechanical and thermodynamic equilibrium at
each stage of the process.

In order to complete this discussion of the choice of a very slow,
essentially adiabatic, statistical mechanical process as providing the
appropriate analogy for a reversible, adiabatic, thermodynamic process,
some remarks must be made as to the role which is thereby assigned,
in accordance with the discussions of §§ 111 and 112, to the walls or
other immediate surroundings of the system proper in readjusting the
representative ensemble to a new canonical distribution after each
change $\delta a$ in an external coordinate. In this respect the treatment is
different from that of Gibbs, who regarded it as desirable to introduce
the definite, abstract idealization that the walls of the container for
a system proper should have no effect on its behaviour.† The justification
for the procedure, which we have adopted at this point, has
a twofold character.

In the first place, the concept of essential isolation provides an 
idealization, which appears to correspond somewhat more closely to the 
actual situations of ordinary interest in thermodynamic studies than 
would be provided by the alternative concept of perfect isolation. The 
assumption, in the case of adiabatic processes, that a thermodynamic 
fluid would merely have no net heat flow from the surroundings on 
the average in a series of similar experiments, would be a less drastic 
abstraction away from reality than the alternative assumption that the 

† Gibbs, Elementary Principles of Statistical Mechanics, Yale University Press, 1902,
p. 164.




fluid should have absolutely no interchange of heat, for example,
with the walls of a cylinder enclosing it or with a piston exerting
mechanical action on it. Furthermore, the assumption that the extent
of interaction between representatives of the system proper and of the
surroundings would be sufficient for a substantial maintenance of
canonical distribution seems reasonable in view of the circumstance
that we regard ourselves as starting out with an exact canonical distribution,
as then making only an infinitesimal change $\delta a$ in some
external coordinate $a$, and as then waiting a very long time before a
further change is made in that coordinate. The initial stage of a finite
change in the coordinate would then in any case be characterized by
the necessity for only an infinitesimal readjustment of energies to secure
canonical redistribution and by plenty of time for this to occur through
interchange with the surroundings, and any later stage would also be
so characterized provided we neglect the circumstance that previous
readjustments could be subject to small and unknown deviations from
strict canonical form.

In the second place, the concept of essential isolation provides an
idealization which is more appropriately related to our general programme
for the statistical mechanical explanation of thermodynamics
than would be provided by the alternative concept of perfect isolation.
It is characteristic of this programme that we correlate the definite
values assigned to mechanical quantities on the basis of thermodynamics
with the mean values which they have in a suitable representative
ensemble. Hence, having correlated the energy $E$ of a thermodynamic
system with the mean energy $\bar{\bar{E}}$ in the corresponding ensemble, it would
then appear appropriate to take a thermodynamic, adiabatic process
as one in which this mean energy was not affected by thermal flow
rather than as one in which absolutely no readjustment in individual
energies could take place. Moreover, having initially introduced canonical
distribution as the suitable representation for the condition of
thermodynamic equilibrium, it would then seem appropriate and desirable
to regard that same kind of distribution as representing the
successive conditions of thermodynamic equilibrium through which a
system would pass in a finite, reversible, adiabatic change. This could
not be achieved using perfect isolation as an idealization, since it is
only for a very limited class of mechanical systems (e.g. an harmonic
oscillator with slowly varied frequency) that canonical distribution
would be retained on the slow variation of an external parameter,
unless we do introduce the needed redistribution of energies through




interchange with the surroundings. It is to be noted in this connexion
that we have already decided to correlate the temperature $T$ of a
thermodynamic system at equilibrium with the distribution parameter
$\theta$ for the corresponding canonical ensemble, and hence, if we did not
provide for the re-establishment of canonical distribution at each stage
of our analogue, we should have nothing to correspond to the continuous
change in temperature which in thermodynamic treatments is
always taken as accompanying reversible adiabatic changes.

The re-establishment of canonical distribution at each stage of a
sufficiently slow, essentially adiabatic, process assumes especial importance
in connexion with our later investigations of the Carnot cycle
and of the formulation of the second law in general, which are to be
carried out in §§ 129 and 130. In those sections we shall derive two
relations,

$$\sum \frac{\bar{\bar{Q}}}{\theta} \le 0$$

and

$$-\Delta\bar{\bar{H}} \ge \int \frac{\delta\bar{\bar{Q}}}{\theta},$$

which are to be regarded as the analogues of the well-known thermodynamic
relations

$$\sum \frac{Q}{T} \le 0$$

and

$$\Delta S \ge \int \frac{\delta Q}{T},$$

which respectively characterize the Carnot cycle and give a general
formulation of the second law. The proofs which we shall give for these
relations follow methods which were devised by Gibbs, and the proofs
would still be valid if we used the idealization of perfect rather than
essential isolation in representing adiabatic processes. Nevertheless, the
validity of the above relations would then be restricted in general to
irreversible processes covered by the signs of inequality, since the
limiting case of reversible processes covered by the signs of equality
could only be approached if the condition of thermodynamic equilibrium
could be regarded at each stage of the change as correlated with
a canonical distribution, as can be secured by employing the idealization
of essential rather than perfect isolation. Without the continuous
readjustment to canonical distribution made possible by the newer
idealization, we should in general encounter unavoidable irreversible
adjustments in distribution when ensembles are employed to represent
the combined reversible processes of making a very slow adiabatic



change in a system of interest, followed by an immersion of the system 
in a heat bath at a temperature so chosen that no thermal flow takes 
place. 

In view of the foregoing discussion it then appears reasonable to 
adopt our proposed modification of the methods of Gibbs and make 
use of the concept of essential rather than of perfect isolation in 
obtaining a suitable idealization for adiabatic thermodynamic processes. 

(b) Irreversible adiabatic change in an external coordinate. Let us
now turn to the case of an abrupt change $\delta a$ in an external coordinate
$a$, followed by the establishment of a new canonical distribution. Under
these circumstances our previous general equation

$$-\delta\bar{\bar{H}} = \frac{\delta\bar{\bar{E}}}{\theta} + \frac{\bar{\bar{A}}}{\theta}\,\delta a \tag{124.6}$$

would still be valid for the change in $\bar{\bar{H}}$, when the new distribution has
established itself, but the change in mean energy $\delta\bar{\bar{E}}$ could no longer
be calculated from (124.3), since an abrupt change in $a$ would in general
change the probabilities for different states $n$. To treat the present
case it is simplest to begin at once by giving separate consideration to
the two successive parts of the process: the abrupt change in $a$, and the
establishment of the new distribution.

In accordance with the treatment of abrupt changes given in § 97 (c),
the probability amplitudes for the different energy states of a single
system, before and after a change $\delta a$, would be connected in agreement
with (97.34) by the relation

$$C_n(a_0+\delta a) = \sum_k S_{nk}\,C_k(a_0), \tag{124.7}$$

where the quantities

$$S_{nk} = \int u_n^{*}(a_0+\delta a)\,u_k(a_0)\,dq \tag{124.8}$$

are the elements of a unitary transformation matrix. Hence, if we now
consider an ensemble of such systems, we can take the probabilities
before and after the change $\delta a$ as connected by the relation

$$\overline{\overline{|C_n(a_0+\delta a)|^2}} = \sum_{k,l} S_{nk}\,S_{nl}^{*}\;\overline{\overline{C_k(a_0)\,C_l^{*}(a_0)}}, \tag{124.9}$$

where the double bar indicates, as usual, a mean taken over all the
members of the ensemble. Furthermore, making use of the diagonal
character of the original canonical distribution, this can be rewritten
in the form

$$\rho_{nn}(a_0+\delta a) = \sum_k |S_{nk}|^2\,\rho_{kk}(a_0). \tag{124.10}$$





This last formula, however, has just the same form as the relation

$$\rho_{nn}(t) = \sum_k |U_{nk}|^2\,\rho_{kk}(0) \tag{124.11}$$

on which we based the derivation of the $H$-theorem itself in § 106, and
the $S_{nk}$ have been shown in § 97 (c) to have the same unitary properties
as the $U_{nk}$ which were needed in that derivation. Hence we can take
the abrupt change in the coordinate as not leading to an increase in $\bar{\bar{H}}$,
and in general as leading to a decrease in that quantity.
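
The effect of such a unitary transformation on the diagonal probabilities can be illustrated with a small numerical sketch. It assumes, purely for illustration, a real orthogonal matrix built from plane rotations in place of the actual transformation matrix of § 97 (c); the energy values, rotation angles, and function names are hypothetical. The squared matrix elements form a doubly stochastic array, and applying it to canonical probabilities should not raise the quantity sum P log P.

```python
import math

def h_bar(probs):
    """The quantity sum_n P_n log P_n."""
    return sum(p * math.log(p) for p in probs if p > 0.0)

def canonical(energies, theta):
    """Canonical probabilities P_n proportional to exp(-E_n / theta)."""
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / theta) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def orthogonal_matrix(n, angles):
    """Product of plane rotations: a real orthogonal, hence unitary, matrix."""
    m = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1, n):
            angle = angles[k % len(angles)]
            c, s = math.cos(angle), math.sin(angle)
            for col in range(n):
                mi, mj = m[i][col], m[j][col]
                m[i][col] = c * mi - s * mj
                m[j][col] = s * mi + c * mj
            k += 1
    return m

def after_abrupt_change(s, probs):
    """New diagonal probabilities: P'_n = sum_k |S_nk|^2 P_k."""
    n = len(probs)
    return [sum(s[r][k] ** 2 * probs[k] for k in range(n)) for r in range(n)]

if __name__ == "__main__":
    p = canonical([0.0, 1.0, 2.0, 3.5], 1.5)
    q = after_abrupt_change(orthogonal_matrix(4, [0.3, 0.7, 1.1]), p)
    print(h_bar(p), h_bar(q))
```

With all rotation angles set to zero the matrix reduces to the identity and the probabilities are unchanged, which corresponds to the limit of no abrupt change at all.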

Turning, then, to the further effects that are to be expected from
the subsequent establishment of a new canonical distribution under the
conditions of essential isolation, it will be evident that they can also
only lead to a decrease in $\bar{\bar{H}}$ for the system proper, since that distribution
is characterized by a minimum of the quantity $\bar{\bar{H}}$ with a fixed
value for the mean energy $\bar{\bar{E}}$.

Hence the succession of the two processes of making an abrupt
change $\delta a$ in some external coordinate $a$ for the members of a canonical
ensemble, and then allowing the establishment of a new canonical distribution
under essentially adiabatic conditions, would lead to a small
change in $\bar{\bar{H}}$ which can be characterized by the relation

$$\delta\bar{\bar{H}} \le 0, \tag{124.12}$$

where the equality sign could only be expected as an exception, or if
the change were actually made slowly instead of abruptly.

The above is of immediate interest in providing the statistical
mechanical analogy for the thermodynamic principle, that the entropy
of a thermodynamic system originally in equilibrium can only increase
as the result of abrupt, adiabatic changes in external coordinates and
of the subsequent re-establishment of equilibrium. Furthermore, if we
regard any change in external coordinates at a finite rate as analysable
into a succession of abrupt changes, we are also provided with a statistical
mechanical analogy for the thermodynamic phenomenon of the
irreversibility of changes made in external coordinates at a finite
rate.

It is also now of interest to return to the general formula (124.6)
which holds in any case for the change in $\bar{\bar{H}}$ when we change from one
canonical distribution to a neighbouring one. If we apply this formula
to the case of an extremely slow and hence reversible change $\delta a$, we
can write

$$-\delta\bar{\bar{H}}_{\mathrm{rev}} = \frac{\delta\bar{\bar{E}}_{\mathrm{rev}}}{\theta} + \frac{\bar{\bar{A}}}{\theta}\,\delta a = 0. \tag{124.13}$$





If, however, we apply it to the same change $\delta a$ made abruptly, we
can write

$$-\delta\bar{\bar{H}}_{\mathrm{irr}} = \frac{\delta\bar{\bar{E}}_{\mathrm{irr}}}{\theta} + \frac{\bar{\bar{A}}}{\theta}\,\delta a \ge 0. \tag{124.14}$$

Combining the two, we obtain

$$\delta\bar{\bar{E}}_{\mathrm{irr}} \ge \delta\bar{\bar{E}}_{\mathrm{rev}}. \tag{124.15}$$

This provides the statistical mechanical analogy for the thermodynamic
principle that a sudden change in the external coordinates for an
isolated system in equilibrium will in general require more work (or
deliver less work) than a gradual change by the same amount.
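
The comparison of work for sudden and gradual changes can be made concrete for a single two-state system. The sketch below is an illustration under assumptions that are not in the text: the model Hamiltonian [[-1, a], [a, 1]], whose eigenvalues are -r and +r with r = sqrt(1 + a^2), the parameter values, and the function names are all hypothetical. For a slow change the probabilities of the two states stay fixed while the levels move; for an abrupt change the probabilities are redistributed by the squared overlaps of old and new eigenvectors.

```python
import math

def gap_and_angle(a):
    """For H(a) = [[-1, a], [a, 1]]: half-gap r and mixing angle alpha.

    The lower eigenvector is (cos(alpha/2), -sin(alpha/2)) with tan(alpha) = a.
    """
    return math.hypot(1.0, a), math.atan2(a, 1.0)

def energy_changes(a0, a1, theta):
    """Return (slow, abrupt) changes in mean energy for the change a0 -> a1."""
    r0, alpha0 = gap_and_angle(a0)
    r1, alpha1 = gap_and_angle(a1)
    # Canonical probabilities of the lower (-r0) and upper (+r0) states.
    w = math.exp(-2.0 * r0 / theta)
    p_low, p_up = 1.0 / (1.0 + w), w / (1.0 + w)
    e_initial = r0 * (p_up - p_low)
    # Slow change: probabilities unaltered, eigenvalues moved to -r1 and +r1.
    e_slow = r1 * (p_up - p_low)
    # Abrupt change: squared overlap of old and new lower eigenvectors.
    c2 = math.cos(0.5 * (alpha1 - alpha0)) ** 2
    q_low = c2 * p_low + (1.0 - c2) * p_up
    q_up = (1.0 - c2) * p_low + c2 * p_up
    e_abrupt = r1 * (q_up - q_low)
    return e_slow - e_initial, e_abrupt - e_initial

if __name__ == "__main__":
    print(energy_changes(0.2, 0.8, 1.0))
```

In this model the abrupt change never leaves the ensemble with less mean energy than the slow one, because the canonical ensemble populates the lower state more heavily and any redistribution moves probability upward.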

125. Effect on $\bar{\bar{H}}$ of interaction in general

We may now continue our discussion of statistical mechanical analogies
for thermodynamic processes by considering the effect on $\bar{\bar{H}}$ if
an ensemble representing a given thermodynamic system of interest is
combined with a second ensemble representing another such system in
a manner to correspond to thermal or mechanical interaction between
the two systems of interest. The result needed for the discussion has
already been provided by our application of the $H$-theorem to two
interacting systems as carried out in § 107.

At an initial time $t = 0$, let us regard ourselves as making observations
on the conditions of the two systems and as setting up the corresponding
representative ensembles, with $\bar{\bar{H}}_1(0)$ and $\bar{\bar{H}}_2(0)$ as the initial
values of $\bar{\bar{H}}$ for the two distributions. Furthermore, at this same time
let us regard ourselves as placing the two systems in interaction with
each other, if this has not already been done, and as representing the
combined system $S_{12}$ by a combined ensemble, which can be constructed
by regarding each member of the ensemble for $S_1$ as combined separately
with each member of the ensemble for $S_2$, thus giving a new ensemble
of higher order in the number of its members.

Let us now regard ourselves as leaving the two systems in interaction
with each other until some later time $t$, at which time renewed
observations might be made on their conditions and the two systems
again separated if desired. In accordance with the application of the
$H$-theorem to the ensemble representing the combined system, as
carried out in § 107, we may then write (107.16),

$$\bar{\bar{H}}_1(t) + \bar{\bar{H}}_2(t) \le \bar{\bar{H}}_1(0) + \bar{\bar{H}}_2(0), \tag{125.1}$$

as an expression connecting the sum of the quantities $\bar{\bar{H}}_1(t)$ and $\bar{\bar{H}}_2(t)$
for the two systems at the later time $t$ with the sum of those
quantities at the initial time $t = 0$.





In the interests of a simplified notation for use in following sections,
it will now be convenient to rewrite the above expression in the form

$$\bar{\bar{H}}_1' + \bar{\bar{H}}_2' \le \bar{\bar{H}}_1 + \bar{\bar{H}}_2, \tag{125.2}$$

where we use unprimed letters to denote the initial values of quantities
and primed letters to denote later values.

Returning again to our correlation of entropy $S$ with the negative
of $\bar{\bar{H}}$, we are thus furnished with a satisfactory statistical mechanical
analogue for the thermodynamic principle, that the sum of the entropies
of two systems cannot be decreased as a result of allowing a
period of interaction between them.
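
The inequality for two interacting ensembles can be illustrated with classical probabilities. The sketch below is only a toy model under stated assumptions: the interaction is represented by a doubly stochastic mixing of selected pairs of joint states, standing in for the unitary development treated in § 107, and the distributions, pairs, mixing weight, and function names are hypothetical. The combined ensemble starts as the product of the two separate distributions, and the separate distributions afterwards are obtained by summing over the states of the partner.

```python
import math

def h_bar(probs):
    """The quantity sum_n P_n log P_n."""
    return sum(p * math.log(p) for p in probs if p > 0.0)

def interact(p1, p2, pairs, w):
    """Combine two ensembles, mix chosen pairs of joint states, and separate.

    pairs is a list of ((i, j), (k, l)) joint-state labels; w in [0, 1] is the
    fraction of probability exchanged within each pair (doubly stochastic).
    """
    joint = [[x * y for y in p2] for x in p1]   # each member paired with each member
    for (i, j), (k, l) in pairs:
        pa, pb = joint[i][j], joint[k][l]
        joint[i][j] = (1.0 - w) * pa + w * pb
        joint[k][l] = w * pa + (1.0 - w) * pb
    q1 = [sum(row) for row in joint]
    q2 = [sum(row[j] for row in joint) for j in range(len(p2))]
    return q1, q2

if __name__ == "__main__":
    p1, p2 = [0.7, 0.2, 0.1], [0.6, 0.4]
    q1, q2 = interact(p1, p2, [((0, 1), (1, 0)), ((1, 1), (2, 0))], 0.4)
    print(h_bar(p1) + h_bar(p2), h_bar(q1) + h_bar(q2))
```

The second printed sum should not exceed the first, since the mixing cannot raise the quantity for the combined ensemble and the separated distributions cannot have a larger sum than the combined one.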


126. Lemma on $\bar{\bar{H}} + \bar{\bar{E}}/\theta$

Before proceeding further with our statistical mechanical analogies
for thermodynamic processes, we must first derive a theorem concerning
the value of $\bar{\bar{H}} + \bar{\bar{E}}/\theta$ in a canonical ensemble. Let us consider a given
canonical ensemble of systems distributed over energy states $n$ according
to the expression

$$P_n = e^{(\psi - E_n)/\theta}. \tag{126.1}$$

And let us also consider any other distribution for the same systems
over energy states $n$ which could, of course, necessarily be described
by an expression of the form

$$P_n' = e^{(\psi - E_n)/\theta + \Delta_n}, \tag{126.2}$$

provided we allow the quantities $\Delta_n$ to assume the needed values. We
now wish to compare the value of the quantity $\bar{\bar{H}} + \bar{\bar{E}}/\theta$ as calculated
for the original canonical distribution with the value of $\bar{\bar{H}}' + \bar{\bar{E}}'/\theta$ as
calculated for the second distribution using therein the value of $\theta$ for
the original canonical ensemble.

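To make the calculation we can write, as a consequence of the significance
of the quantities $\bar{\bar{H}}$ and $\bar{\bar{E}}$,

$$\begin{aligned}
\Bigl(\bar{\bar{H}}' + \frac{\bar{\bar{E}}'}{\theta}\Bigr) - \Bigl(\bar{\bar{H}} + \frac{\bar{\bar{E}}}{\theta}\Bigr)
&= \sum_n P_n'\Bigl(\log P_n' + \frac{E_n}{\theta}\Bigr) - \sum_n P_n\Bigl(\log P_n + \frac{E_n}{\theta}\Bigr) \\
&= \sum_n P_n'\Bigl(\frac{\psi}{\theta} + \Delta_n\Bigr) - \sum_n P_n\,\frac{\psi}{\theta} \\
&= \sum_n P_n'\,\Delta_n \\
&= \sum_n \bigl(P_n'\,\Delta_n + P_n - P_n'\bigr) \\
&= \sum_n e^{(\psi - E_n)/\theta}\,\bigl[\Delta_n e^{\Delta_n} + 1 - e^{\Delta_n}\bigr],
\end{aligned} \tag{126.3}$$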




where the third and fourth forms of writing make use of the considera- 
tion that the total summation over all values of n for the probabilities 
expressed by (126.1) and (126.2) must lead to the value unity. 

The quantity in the final square brackets, however, is of the form

$$Q = x e^{x} + 1 - e^{x}, \tag{126.4}$$

with the derivative with respect to $x$

$$\frac{dQ}{dx} = x e^{x}. \tag{126.5}$$

It is then seen to be a decreasing function of $x$ when $x$ is negative, to
have the value zero when $x = 0$, and to be an increasing function of
$x$ for $x$ positive. It thus satisfies the condition

$$Q \ge 0 \tag{126.6}$$

with the value zero only when

$$x = 0. \tag{126.7}$$

Hence, since all the quantities involved in the summation (126.3) must
be equal to or greater than zero, we now obtain the relation between
the two quantities

$$\bar{\bar{H}}' + \frac{\bar{\bar{E}}'}{\theta} \ge \bar{\bar{H}} + \frac{\bar{\bar{E}}}{\theta}. \tag{126.8}$$

In accordance with this expression the quantity $\bar{\bar{H}} + \bar{\bar{E}}/\theta$, where $\theta$ is
some constant, has a minimum value when the ensemble is canonically
distributed with $\theta$ as the parameter of the distribution, and this is the
lemma which we desired to prove.
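
The lemma lends itself to a direct numerical check. The following sketch is an illustration under stated assumptions: the energy values, the distribution parameter, the use of randomly generated comparison distributions, and all function names are hypothetical. For each comparison distribution it evaluates the difference of the two values of H + E/theta directly, and also as the weighted sum of the function Q = x e^x + 1 - e^x taken at x equal to the logarithm of the ratio of the two probabilities for each state.

```python
import math
import random

def canonical(energies, theta):
    """Canonical probabilities P_n proportional to exp(-E_n / theta)."""
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / theta) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]

def h_plus_e_over_theta(probs, energies, theta):
    """Evaluate sum_n P_n (log P_n + E_n / theta)."""
    return sum(p * (math.log(p) + e / theta)
               for p, e in zip(probs, energies) if p > 0.0)

def q_function(x):
    """Q = x e^x + 1 - e^x, which is never negative."""
    return x * math.exp(x) + 1.0 - math.exp(x)

def compare(energies, theta, other):
    """Return the difference of H + E/theta, directly and via the Q-sum."""
    p = canonical(energies, theta)
    difference = (h_plus_e_over_theta(other, energies, theta)
                  - h_plus_e_over_theta(p, energies, theta))
    deltas = [math.log(o / c) for o, c in zip(other, p)]
    summed = sum(c * q_function(d) for c, d in zip(p, deltas))
    return difference, summed

def random_distribution(n, rng):
    """A strictly positive, normalized, otherwise arbitrary distribution."""
    weights = [rng.random() + 1e-6 for _ in range(n)]
    total = sum(weights)
    return [w / total for w in weights]

if __name__ == "__main__":
    rng = random.Random(0)
    print(compare([0.0, 0.7, 1.3, 2.0, 3.1], 1.2, random_distribution(5, rng)))
```

The two returned numbers should agree to rounding error and should not be negative, whichever comparison distribution is supplied.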

127. The direction of thermal flow as dependent on $\theta$

So far our discussion of analogies for thermodynamic processes has
been confined to illustrations of the propriety of regarding the negative
of the quantity $\bar{\bar{H}}$ as the analogue of entropy $S$. We now turn to our
first illustration of the propriety of regarding the parameter $\theta$ in a
canonical ensemble as the analogue of the temperature $T$ in the corresponding
thermodynamic system. We shall be able to show that the
direction of energy flow between two canonical ensembles which are
put in thermal contact is determined by the relative values of their
distribution parameters $\theta$, just as energy flow between bodies put in
contact is determined by the relative values of their temperatures $T$.

Let us consider two separate canonical ensembles, representing two 
different systems each in a condition of thermodynamic equilibrium; 
and let us consider the result of making contact for a time between 





members of the two ensembles in such a way as to secure an analogy
for thermal interaction. Before the contact is made let us use

$$\bar{\bar{H}}_1,\ \bar{\bar{E}}_1,\ \theta_1 \quad\text{and}\quad \bar{\bar{H}}_2,\ \bar{\bar{E}}_2,\ \theta_2 \tag{127.1}$$

to denote the quantities indicated for the two ensembles respectively.
And after the contact is broken let us use

$$\bar{\bar{H}}_1',\ \bar{\bar{E}}_1' \quad\text{and}\quad \bar{\bar{H}}_2',\ \bar{\bar{E}}_2' \tag{127.2}$$

for the quantities indicated, no parameter $\theta$ now being included since
in general neither of the systems would now be in equilibrium and
neither of the ensembles would be canonically distributed. In accordance
with the general result as to the effect of interaction between
ensembles as shown by expression (125.2) in § 125, we can write

$$\bar{\bar{H}}_1' + \bar{\bar{H}}_2' \le \bar{\bar{H}}_1 + \bar{\bar{H}}_2. \tag{127.3}$$

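And, in accordance with the theorem of the preceding section as
expressed by (126.8), we can write

$$\bar{\bar{H}}_1 + \frac{\bar{\bar{E}}_1}{\theta_1} \le \bar{\bar{H}}_1' + \frac{\bar{\bar{E}}_1'}{\theta_1} \tag{127.4}$$

and

$$\bar{\bar{H}}_2 + \frac{\bar{\bar{E}}_2}{\theta_2} \le \bar{\bar{H}}_2' + \frac{\bar{\bar{E}}_2'}{\theta_2}, \tag{127.5}$$

where $\theta_1$ and $\theta_2$ are the parameters for the two original canonical
ensembles. Adding all three expressions (127.3-5) together, we then
obtain

$$\frac{\bar{\bar{E}}_1}{\theta_1} + \frac{\bar{\bar{E}}_2}{\theta_2} \le \frac{\bar{\bar{E}}_1'}{\theta_1} + \frac{\bar{\bar{E}}_2'}{\theta_2}
\quad\text{or}\quad
\frac{\bar{\bar{E}}_1' - \bar{\bar{E}}_1}{\theta_1} + \frac{\bar{\bar{E}}_2' - \bar{\bar{E}}_2}{\theta_2} \ge 0. \tag{127.6}$$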


Let us now consider the case of pure thermal interaction where the
work done in making and breaking the thermal contact between the
two ensembles is negligible, and the members of the two ensembles that
have been combined do no work on each other during the interaction.
In accordance with the statistical mechanical analogue of the first law
of thermodynamics, § 119, we shall then have

$$\bar{\bar{E}}_1' + \bar{\bar{E}}_2' - \bar{\bar{E}}_1 - \bar{\bar{E}}_2 = 0, \tag{127.7}$$

and can take

$$\Delta\bar{\bar{E}}_1 = \bar{\bar{E}}_1' - \bar{\bar{E}}_1 \tag{127.8}$$

as representing the heat flow into the first system from the second
during the period of contact. Substituting in (127.6), we thus have

$$\Delta\bar{\bar{E}}_1 \Bigl(\frac{1}{\theta_1} - \frac{1}{\theta_2}\Bigr) \ge 0. \tag{127.9}$$





This result shows that the direction of energy transfer can only be
from the ensemble originally having the higher value of the distribution
parameter $\theta$ into the ensemble having the lower value of $\theta$. Returning
to our proposed correlation of the distribution parameter $\theta$ with temperature
$T$, we are thus provided with a satisfactory analogy for the
thermodynamic principle that heat can only flow from a body of higher
temperature to one of lower temperature when thermal contact is made.
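
The direction of transfer can be seen in the simplest possible model. The sketch below assumes, for illustration only, two ensembles of two-level systems with the same level spacing, and represents thermal contact by exchanging a fraction of probability between the two joint states of equal total energy (first system lower and second upper, and the reverse); the spacing `eps`, the fraction `w`, and the function names are hypothetical. Because the two joint states have the same energy, no work is done and the total mean energy is conserved.

```python
import math

def two_level(theta, eps=1.0):
    """Canonical probabilities [lower, upper] for levels 0 and eps."""
    w = math.exp(-eps / theta)
    return [1.0 / (1.0 + w), w / (1.0 + w)]

def heat_into_first(theta1, theta2, w=0.3, eps=1.0):
    """Mean energy gained by the first ensemble during one contact."""
    p1, p2 = two_level(theta1, eps), two_level(theta2, eps)
    p_lower_upper = p1[0] * p2[1]   # first system lower, second upper
    p_upper_lower = p1[1] * p2[0]   # first system upper, second lower
    # Mixing moves the probability w * (difference) into the state in which
    # the first system is in its upper level, raising its mean energy by eps each.
    return eps * w * (p_lower_upper - p_upper_lower)

if __name__ == "__main__":
    for t1, t2 in [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5)]:
        print(t1, t2, heat_into_first(t1, t2))
```

In this model the first ensemble gains energy only when the second has the larger distribution parameter, and there is no net transfer when the two parameters are equal.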

128. Effects of various kinds of thermal process

(a) Effect on $\bar{\bar{H}}$ of thermal interaction in general. We may now consider
the effects to be expected in a number of different cases where
ensembles are combined in such a way as to represent thermal flow.
We shall thus obtain further illustrations of the propriety of correlating
entropy $S$ with the negative of the quantity $\bar{\bar{H}}$, and temperature $T$
with the parameter $\theta$.

We may first consider the effect on $\bar{\bar{H}}$ of any kind of thermal interaction
between ensembles which may represent thermodynamic systems
in any condition of interest, whether one of equilibrium or not. To treat
the problem we have only to note that a purely thermal interaction
between ensembles is merely a special case of interaction in general,
when the work done in making and breaking the thermal contact is
negligible, and the members of the two ensembles that have been combined
do no work on each other. Hence we can at once apply expression
(125.2) derived in the general treatment of interaction. This gives us

$$\bar{\bar{H}}_1' + \bar{\bar{H}}_2' \le \bar{\bar{H}}_1 + \bar{\bar{H}}_2, \tag{128.1}$$

which says that the sum of the quantities $\bar{\bar{H}}$ for the two systems after
the thermal transfer has been completed cannot be greater than the
original sum of those quantities. This is in immediate analogy with
the thermodynamic principle that the sum of the entropies of two
bodies cannot be decreased by thermal interaction.

(b) Effect on $\bar{\bar{H}}$ of thermal transfer from a canonical ensemble. We
now consider the case in which an ensemble originally arbitrarily distributed,
representing some system of interest in an arbitrary condition,
is placed in thermal contact with an ensemble which is itself canonically
distributed, representing a body of specified temperature. Using
unprimed and primed letters to refer respectively to conditions before
and after thermal contact, we may now write, in accordance with our
general expression (128.1),

$$\bar{\bar{H}}_1' + \bar{\bar{H}}_2' \le \bar{\bar{H}}_1 + \bar{\bar{H}}_2 \tag{128.2}$$





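for the sums of the values of $\bar{\bar{H}}$ for the two ensembles before and after
thermal contact, and, in accordance with the lemma of § 126, we may
write

$$\bar{\bar{H}}_2 + \frac{\bar{\bar{E}}_2}{\theta_2} \le \bar{\bar{H}}_2' + \frac{\bar{\bar{E}}_2'}{\theta_2} \tag{128.3}$$

as an expression applying to the ensemble which is originally canonically
distributed with the distribution parameter $\theta_2$. Adding these two
expressions, we can then obtain

$$\bar{\bar{H}}_1' - \bar{\bar{H}}_1 \le \frac{\bar{\bar{E}}_2' - \bar{\bar{E}}_2}{\theta_2},$$

or, by the energy principle, since the work done is taken as negligible,

$$-(\bar{\bar{H}}_1' - \bar{\bar{H}}_1) \ge \frac{\bar{\bar{E}}_1' - \bar{\bar{E}}_1}{\theta_2}, \tag{128.4}$$

where $(\bar{\bar{E}}_1' - \bar{\bar{E}}_1)$ is the mean energy thermally transferred to the members
of the first ensemble.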

This provides us with a satisfactory analogy for the thermodynamic 
principle that the increase in entropy of a system, as a result of making 
thermal contact with a body in an equilibrium condition, cannot be 
less than the heat absorbed divided by the original temperature of the 
body supplying the heat, whatever the initial condition of the system 
of interest. 

(c) Thermal equilibrium as a result of successive contacts. We may
also use the expression which we have just derived to investigate the
effect of successive thermal contacts between an ensemble having an
arbitrary initial distribution and ensembles which are canonically distributed
with the parameter $\theta_2$. For this purpose let us write (128.4) in
the form

$$\bar{\bar{H}}_1' + \frac{\bar{\bar{E}}_1'}{\theta_2} \le \bar{\bar{H}}_1 + \frac{\bar{\bar{E}}_1}{\theta_2}. \tag{128.5}$$

This shows that the quantity $\bar{\bar{H}} + \bar{\bar{E}}/\theta_2$ for the first ensemble can in
general be expected to decrease as a result of making contacts with
ensembles canonically distributed with the parameter $\theta_2$. We have
shown, however, in accordance with the lemma derived in § 126, expressed
by (126.8), that the quantity $\bar{\bar{H}} + \bar{\bar{E}}/\theta_2$, where $\theta_2$ is some constant,
would have a minimum value for an ensemble which is itself
canonically distributed with the quantity $\theta_2$ as the parameter of distribution.
Hence we may conclude that any original ensemble would
be given a canonical distribution with the parameter $\theta_2$, as a result
of successive thermal contacts with ensembles which are themselves
canonically distributed with $\theta_2$ as the parameter.





We are thus provided with a satisfactory analogy for the thermodynamic
principle that any system can be brought into a condition of
thermodynamic equilibrium at the temperature $T$ as a result of successive
contacts with bodies which are themselves in equilibrium at
that temperature.
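
The approach to canonical distribution through successive contacts can be followed step by step in a toy model. The sketch below assumes, for illustration only, a two-level system of interest and two-level reservoir members with the same level spacing, each contact exchanging a fraction `w` of probability between the two joint states of equal total energy; these choices and all function names are hypothetical. A fresh canonically distributed reservoir member is used at every contact, and the quantity H + E/theta for the system of interest is recorded after each one.

```python
import math

def two_level(theta, eps=1.0):
    """Canonical probabilities [lower, upper] for levels 0 and eps."""
    w = math.exp(-eps / theta)
    return [1.0 / (1.0 + w), w / (1.0 + w)]

def h_plus_e(p, theta, eps=1.0):
    """sum P log P plus mean energy over theta, for levels 0 and eps."""
    h = sum(x * math.log(x) for x in p if x > 0.0)
    return h + eps * p[1] / theta

def contact(p, reservoir, w):
    """One contact: mix joint states (lower, upper) and (upper, lower)."""
    flow = w * (p[0] * reservoir[1] - p[1] * reservoir[0])
    return [p[0] - flow, p[1] + flow]

def successive_contacts(p, theta2, w=0.3, count=100):
    """Apply repeated contacts with fresh canonical reservoir members."""
    reservoir = two_level(theta2)
    history = [h_plus_e(p, theta2)]
    for _ in range(count):
        p = contact(p, reservoir, w)
        history.append(h_plus_e(p, theta2))
    return p, history

if __name__ == "__main__":
    final, history = successive_contacts([0.1, 0.9], 2.0)
    print(final, two_level(2.0), history[0], history[-1])
```

In this model the recorded quantity never rises from one contact to the next, and the distribution of the system of interest tends to the canonical one belonging to the reservoir parameter.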

The foregoing is very interesting in showing that we can regard the 
canonical distribution as not only appropriate for representing an 
isolated thermodynamic system which has come to equilibrium by itself 
(see § 120), but as also appropriate for representing a thermodynamic 
system which has been brought into equilibrium at a prescribed tem- 
perature T by contact with a large heat reservoir at that temperature. 
This latter was one of the most important of the arguments presented 
by Gibbs justifying the use of the canonical ensemble as a representa- 
tion of thermodynamic equilibrium. 

(d) The limiting case of reversible thermal transfer. The results of
thermal transfer between ensembles of systems are in general irreversible
in that the sum of the quantities $\bar{\bar{H}}$ for the two ensembles is
found to be decreased in accordance with (128.1) as a result of their
interaction. The limiting case of reversible transfer is of special
interest.

To study this, let us consider two representative ensembles which are
originally each of them canonically distributed, with the parameters
$\theta_1$ and $\theta_2$. Let them then be combined in a manner to represent thermal
contact, and separated again after there has been on the average only
an infinitesimal transfer of energy. Finally, let each of them be allowed
to assume its new canonical distribution as a consequence of the merely
essential isolation which we have found it appropriate to assume. In
accordance with our general relation (121.8), connecting neighbouring
canonical distributions, we can then write for the changes in the
quantity $\bar{\bar{H}}$ for the two ensembles

$$-\delta\bar{\bar{H}}_1 = \frac{\delta\bar{\bar{E}}_1}{\theta_1} \quad\text{and}\quad -\delta\bar{\bar{H}}_2 = \frac{\delta\bar{\bar{E}}_2}{\theta_2}, \tag{128.6}$$

where the assumed absence of external work appropriate in a case of
pure thermal interaction permits us to omit the terms in (121.8) depending
on external forces. The absence of external work will also
permit us to take $\delta\bar{\bar{E}}_1 = -\delta\bar{\bar{E}}_2$ as the mean energy transferred by
thermal action from the second ensemble to the first. Hence, adding




tlie_Wo above equations, we can now write for the sum of the changes 
in B. for the two ensembles 

-m+Si, = (128.7) 

This result shows us that the change in the sum of the values of $\bar{\bar{H}}$ 
for the two ensembles for a given mean energy transfer would approach 
zero as the distribution parameters for the two ensembles approach each 
other. Furthermore, by taking $\theta_2$ infinitesimally larger or smaller than 
$\theta_1$, we could approach the limiting possibility of thermal transfer in 
either direction without change in the combined values of $\bar{\bar{H}}$ for the 
two ensembles. Equation (128.7) hence provides us with an analogy for 
the thermodynamic principle of constant total entropy at the limit 
when heat is transferred between two bodies having the same tempera- 
ture. We are thus led to an analogy for the reversibility of heat transfer 
under those limiting conditions, as compared with the irreversibility of 
heat transfer to be expected under more general circumstances. 
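
As a numerical illustration of the foregoing, and not as part of the original argument, the following Python sketch may be of use. It assumes a hypothetical system with three energy levels (the values in `levels` are arbitrary), checks the relation between neighbouring canonical distributions on which (128.6) rests, namely that the decrease in $\bar{\bar{H}}$ equals the gain in mean energy divided by $\theta$, and then evaluates the right-hand side of (128.7) as $\theta_2$ approaches $\theta_1$. All function names are our own.

```python
import math


def canonical_probs(levels, theta):
    """Canonical probabilities P_n = exp(-E_n/theta) / sum_m exp(-E_m/theta)."""
    weights = [math.exp(-e / theta) for e in levels]
    total = sum(weights)
    return [w / total for w in weights]


def h_quantity(probs):
    """The quantity H = sum_n P_n log P_n (states with P_n = 0 contribute nothing)."""
    return sum(p * math.log(p) for p in probs if p > 0.0)


def mean_energy(levels, probs):
    """Mean energy of the ensemble, sum_n P_n E_n."""
    return sum(p * e for e, p in zip(levels, probs))


def changes_between_neighbours(levels, theta, d_theta):
    """Return (-delta H, delta E) on passing from theta to theta + d_theta."""
    p_old = canonical_probs(levels, theta)
    p_new = canonical_probs(levels, theta + d_theta)
    minus_dh = h_quantity(p_old) - h_quantity(p_new)
    de = mean_energy(levels, p_new) - mean_energy(levels, p_old)
    return minus_dh, de


def sum_of_changes(theta_1, theta_2, delta_q):
    """Right-hand side of (128.7) for mean energy delta_q passed from ensemble 2 to 1."""
    return (1.0 / theta_1 - 1.0 / theta_2) * delta_q


if __name__ == "__main__":
    levels = [0.0, 1.0, 2.5]  # hypothetical energy eigenvalues, arbitrary units
    minus_dh, de = changes_between_neighbours(levels, theta=1.0, d_theta=1e-6)
    print("-dH =", minus_dh, " dE/theta =", de / 1.0)
    for theta_2 in (2.0, 1.1, 1.001):
        print("theta_2 =", theta_2, " -(dH1 + dH2) =", sum_of_changes(1.0, theta_2, 0.01))
```

The printed sum of changes is positive whenever energy passes from the ensemble with the larger parameter to that with the smaller, and it falls towards zero as the two parameters are brought together, which is the limiting reversible case discussed above.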

129. Carnot cycle of processes 

As a final example of processes carried out on ensembles which are 
analogous to typical thermodynamic processes, we may treat the case 
of the Carnot cycle. To do this let us consider an ensemble of systems 
No. 0 which we can take as representing ‘the engine’, and further 
ensembles Nos. 1, 2, 3,... which we can take as representing ‘heat 
reservoirs’. Let us start with the ensembles for the engine and heat 
reservoirs each canonically distributed with the distribution parameters 
$\theta_0, \theta_1, \theta_2, \theta_3, \dots$. Let us then carry out a cycle of processes, involving 
external work consequent on the changes made in the external co- 
ordinates for the ensemble corresponding to the engine and involving 
thermal transfer between the ensembles for the engine and the heat 
reservoirs. Finally, at the end of the cycle, let the ensemble for the 
engine have the same values of external coordinates and the same mean 
energy $\bar{E}_0$ as at the start, and let it once more assume its original 
canonical distribution. 

In accordance with our general expressions (124.12) and (126.2) for 
the effects of changing the external coordinates in an ensemble, and 
of introducing interactions with other ensembles, we can write 

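$$\bar{\bar{H}}'_0 + \bar{\bar{H}}'_1 + \bar{\bar{H}}'_2 + \dots \le \bar{\bar{H}}_0 + \bar{\bar{H}}_1 + \bar{\bar{H}}_2 + \dots \tag{129.1}$$

and, in accordance with the lemma expressed by (126.8) as to the 
minimum of $\bar{\bar{H}} + \bar{E}/\theta$ for a canonical distribution, we can write 

$$\begin{aligned}
\bar{\bar{H}}_0 + \frac{\bar{E}_0}{\theta_0} &\le \bar{\bar{H}}'_0 + \frac{\bar{E}'_0}{\theta_0},\\
\bar{\bar{H}}_1 + \frac{\bar{E}_1}{\theta_1} &\le \bar{\bar{H}}'_1 + \frac{\bar{E}'_1}{\theta_1},\\
\bar{\bar{H}}_2 + \frac{\bar{E}_2}{\theta_2} &\le \bar{\bar{H}}'_2 + \frac{\bar{E}'_2}{\theta_2},\\
&\;\;\vdots
\end{aligned} \tag{129.2}$$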


where the unprimed and primed letters refer respectively to conditions 
at the beginning and end of the cycle. 

Adding the above expressions, we obtain 




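$$\frac{\bar{E}_0}{\theta_0} + \frac{\bar{E}_1}{\theta_1} + \frac{\bar{E}_2}{\theta_2} + \dots \le \frac{\bar{E}'_0}{\theta_0} + \frac{\bar{E}'_1}{\theta_1} + \frac{\bar{E}'_2}{\theta_2} + \dots \tag{129.3}$$

By hypothesis, however, we are interested in the case where the final 
mean energy $\bar{E}'_0$ of the ensemble corresponding to the engine is equal 
to the original value $\bar{E}_0$. Hence we can rewrite the above in the form 

$$\frac{\bar{E}_1 - \bar{E}'_1}{\theta_1} + \frac{\bar{E}_2 - \bar{E}'_2}{\theta_2} + \dots \le 0, \tag{129.4}$$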


which is arranged to show the dependence on the mean energy trans- 
ferred from the ensembles representing the heat reservoirs. In addition 
we can write, in accordance with the energy principle as applied to 
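ensembles, 

$$\bar{W} = (\bar{E}_1 - \bar{E}'_1) + (\bar{E}_2 - \bar{E}'_2) + \dots, \tag{129.5}$$

where $\bar{W}$ is the mean external work done by the ensemble representing 
the engine. 

For the case of a simple Carnot cycle, involving only two heat 
reservoirs, we can combine (129.4) and (129.5) in the form 

$$\bar{W} \le \frac{\theta_1 - \theta_2}{\theta_1}\,(\bar{E}_1 - \bar{E}'_1),$$

or, denoting by $\bar{Q}_1 = \bar{E}_1 - \bar{E}'_1$ the mean thermal transfer of energy from 
the ensemble originally distributed with the parameter $\theta_1$, this can be 
expressed in the form 

$$\bar{W} \le \frac{\theta_1 - \theta_2}{\theta_1}\,\bar{Q}_1. \tag{129.6}$$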


We thus obtain a satisfactory analogy for the fundamental 
thermodynamic relation of Carnot, which connects the work done by a heat 
engine with the heat absorbed at the higher temperature, and the 
values of the two temperatures at which heat is taken in and given out. 

The equality sign in (129.6) would be valid for the limiting case of 
a cycle of reversible steps, with heat reservoirs large compared with the 
energy that they must absorb or receive. To carry out such a reversible 
cycle the external coordinates for the ensemble representing the engine 
would have to be changed at a vanishingly slow rate, and the ensemble 
would in any step have to be canonically distributed with sensibly 
the same value for $\theta$ as that for any heat reservoir with which it is 
in contact. In accordance with the results of §§ 124(a) and 128 (d), we 
should then have the equality sign holding in the first of the expressions 
(129.1) on which we have based our deduction. Furthermore, we can 
in any case use the equality sign in the first of the expressions given 
by (129.2), since the ensemble for the engine returns by hypothesis to 
its original canonical distribution. And, with sufficiently large heat 
reservoirs so that their mean energy is changed only infinitesimally, we 
could also use the equality sign in the remaining expressions (129.2), 
in accordance with the substantial maintenance of canonical distribu- 
tion. Thus, under these limiting circumstances, we should have the 
equality sign holding in all of the expressions on which the deduction 
has been based, and hence would also have the equality sign in the 
final expression (129.6) for the work done. 
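
The content of (129.4) to (129.6) for two reservoirs may be illustrated by a short Python sketch. The numerical values are purely illustrative and the function names are our own. The sketch evaluates the bound (129.6) and then tests (129.4) for several trial values of the mean work, the energy given up by the second reservoir being taken from (129.5).

```python
def carnot_work_bound(q_1, theta_1, theta_2):
    """Right-hand side of (129.6): largest mean work for heat q_1 taken in at theta_1."""
    if theta_1 <= 0.0 or theta_2 <= 0.0:
        raise ValueError("distribution parameters must be positive")
    return (theta_1 - theta_2) / theta_1 * q_1


def satisfies_129_4(q_1, work, theta_1, theta_2, tol=1e-12):
    """Test (129.4) for two reservoirs, with Q_2 = W - Q_1 taken from (129.5)."""
    q_2 = work - q_1
    return q_1 / theta_1 + q_2 / theta_2 <= tol


if __name__ == "__main__":
    q_1, theta_1, theta_2 = 100.0, 400.0, 300.0  # illustrative values only
    bound = carnot_work_bound(q_1, theta_1, theta_2)
    for work in (10.0, bound, 30.0):
        print("W =", work, " allowed by (129.4):", satisfies_129_4(q_1, work, theta_1, theta_2))
```

A trial value of the work below the bound satisfies the inequality, the bound itself corresponds to the equality sign of the reversible cycle, and any larger value violates (129.4).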

130. The analogue of the second law of thermodynamics 

In preceding sections we have discussed the statistical mechanical 
analogues for a considerable number of thermodynamic processes that 
involve the second law of thermodynamics. These processes included 
examples of the reversible and irreversible performance of external 
work and of the reversible and irreversible transfer of heat, and the 
situations treated seemed typical of simple thermodynamic processes 
in general. The treatments all showed the satisfactory character of our 
proposals to correlate the entropy $S$ of a thermodynamic system with 
the negative of the quantity $\bar{\bar{H}}$ for the corresponding representative 
ensemble of systems, and to correlate the temperature $T$ of a 
thermodynamic system in an equilibrium condition with the distribution 
parameter $\theta$ for the corresponding canonical ensemble. We are now in a 
position to make a somewhat complete statement of the statistical 
mechanical analogue of the second law of thermodynamics. 

For our purposes we may regard the content of the ordinary second 
law of thermodynamics as given by the expression 


$$\Delta S \ge \int \frac{dQ}{T}, \tag{130.1}$$

which states that the increase in the entropy of a system, when it 
changes from one condition to another, cannot be less than the integral 
of the heat absorbed divided for each increment of heat by the 
temperature of a heat reservoir appropriate for supplying the increment 
in question. The equality sign in this expression is to be taken as 
applying to the limiting case of reversible changes. 

The justification for the statistical mechanical analogue for this 
expression is immediately provided by relations which we have already 
derived in the present chapter. The change in the ensemble 
representing the system of interest will be brought about by making changes 
in the values of the external coordinates for the ensemble, and by 
permitting the thermal transfer of energy from other ensembles. We have 
found, however, in §§ 124 (a) and (b), that the effect on $\bar{\bar{H}}$ of a small 
change $\delta a$ in an external coordinate $a$ would be given by 

$$-\delta\bar{\bar{H}} \ge 0, \tag{130.2}$$

where the equality sign would apply to the limiting case of a reversible 
change. And we have found in §§ 128 (b) and (d) that the effect on $\bar{\bar{H}}$ 
of the transfer of a small amount of thermal energy of magnitude $\delta\bar{Q}$ in 
the mean from a canonical ensemble with the distribution parameter $\theta$ 
to an ensemble representing any system of interest would be given by 

$$-\delta\bar{\bar{H}} \ge \frac{\delta\bar{Q}}{\theta}, \tag{130.3}$$

where the equality sign again applies to the limiting case of reversible 
transfer between canonical ensembles having sensibly the same value 
of $\theta$. Moreover, the nature of our derivations was such that it is proper 
to regard the effects of thermal transfer and of changes in external 
coordinates as superposable. 

Hence, by combining (130.2) and (130.3), we are now led, after 
integration, to the desired expression 

$$-\Delta\bar{\bar{H}} \ge \int \frac{\delta\bar{Q}}{\theta}, \tag{130.4}$$

where the equality sign is seen to apply to the limiting case of reversible 
processes of a nature such that all the ensembles and external bodies 
involved could be restored to their original condition. We at once see 
that this is the appropriate statistical mechanical analogue for the 
general statement of the second law of thermodynamics as expressed 
by (130.1), provided we correlate $S$ with $-k\bar{\bar{H}}$ and $T$ with $\theta/k$, as has 
been proposed. 
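
The inequality (130.4), and its approach to an equality in the reversible limit, can be illustrated numerically. The following Python sketch rests on assumptions of our own: a hypothetical two-level system, heated in a number of steps by contact with successive canonical reservoirs, and supposed to re-assume a canonical distribution at the parameter of the reservoir after each step. It compares the sum of the mean energies absorbed, each divided by the parameter of the supplying reservoir, with the total decrease in $\bar{\bar{H}}$.

```python
import math


def canonical_probs(levels, theta):
    """Canonical probabilities P_n = exp(-E_n/theta) / sum_m exp(-E_m/theta)."""
    weights = [math.exp(-e / theta) for e in levels]
    total = sum(weights)
    return [w / total for w in weights]


def h_quantity(probs):
    """The quantity H = sum_n P_n log P_n."""
    return sum(p * math.log(p) for p in probs if p > 0.0)


def mean_energy(levels, theta):
    """Mean energy of the canonical ensemble with parameter theta."""
    return sum(p * e for e, p in zip(levels, canonical_probs(levels, theta)))


def stepwise_heating(levels, theta_start, theta_end, steps):
    """Return (sum of dQ/theta_reservoir, -Delta H) for heating in equal steps of theta."""
    thetas = [theta_start + (theta_end - theta_start) * i / steps
              for i in range(steps + 1)]
    integral = 0.0
    for theta_old, theta_new in zip(thetas, thetas[1:]):
        # Mean energy absorbed from the reservoir whose parameter is theta_new.
        d_q = mean_energy(levels, theta_new) - mean_energy(levels, theta_old)
        integral += d_q / theta_new
    minus_delta_h = (h_quantity(canonical_probs(levels, theta_start))
                     - h_quantity(canonical_probs(levels, theta_end)))
    return integral, minus_delta_h


if __name__ == "__main__":
    levels = [0.0, 1.0]  # hypothetical two-level system, arbitrary units
    for steps in (1, 10, 1000):
        integral, minus_delta_h = stepwise_heating(levels, 0.5, 2.0, steps)
        print(steps, integral, minus_delta_h)
```

With a single abrupt step the sum falls well short of the decrease in $\bar{\bar{H}}$, while with many small steps, each carried out between ensembles having sensibly the same parameter, the two quantities approach equality.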




This, then, completes our statistical mechanical justification for both 
of the two fundamental laws of thermodynamics. We are now provided 
with statistical expressions (119.4) and (130.4) of just the same form 
as those used for the thermodynamic statement of the two laws, with 
the thermodynamic quantities $E$, $Q$, $W$, $S$, and $T$ replaced by their 
statistical mechanical analogues $\bar{E}$, $\bar{Q}$, $\bar{W}$, $-k\bar{\bar{H}}$, and $\theta/k$; and further 
statistical mechanical deductions can hence now be carried out in a 
form to run parallel to deductions familiar in thermodynamics. 

131. Remarks on the statistical explanation of thermodynamics 

We may now conclude the present chapter by making a few remarks 
concerning the nature of the explanation which the methods of statisti- 
cal mechanics have provided for the principles of thermodynamics. 

The fundamental idea in this explanation lies in regarding the 
thermodynamic behaviour of a single system of interest as equivalent 
to the mechanical behaviour which would be exhibited on the average 
by a suitably chosen ensemble of systems of similar structure. The 
introduction of such an idea, if we are to attempt a mechanical explana- 
tion of the behaviour of thermodynamic systems, seems quite reasonable 
in view of the consideration that the thermodynamic specification of 
the condition of a system is incomplete enough to comport with a wide 
variety of specifications having the precision allowed by mechanics. 
The importance and significance of the idea was appreciated to some 
extent by Boltzmann, but was first fully and completely understood 
and applied by Gibbs. The validity of the idea has been made evident 
by the results which we have given in the present chapter. 

In carrying out the implications of this fundamental idea, we have 
found it necessary to correlate the thermodynamic variables ordinarily 
used to specify the condition of a thermodynamic system with mechani- 
cal quantities applying to the corresponding representative ensemble. 
In the case of the external coordinates describing a thermodynamic 
system, we have found it possible, paying due regard to the new 
requirements imposed by the quantum mechanics, to give all the 
systems in the ensemble the same values for their external coordinates 
as those of the system to be represented. In the case of purely mechani- 
cal quantities, such as the energy or external forces exhibited by a 
thermodynamic system, we have found it possible to make a correlation 
with the average values of these quantities in the representative 
ensemble. Here it may be remarked that the mean proves the most 
desirable average to take, and that inevitable quantum mechanical as 
well as deliberate experimental limitations on the accuracy of our 
observational information may play a role in the necessity for 
representing such quantities by their mean values. In the case of the 
essentially thermodynamic variables, entropy and temperature, 
considerable investigation was necessary to validate the choice of statistical 
correlates. 

In the case of entropy, except for a constant factor needed to allow 
for choice of units, it was found satisfactory to correlate this 
thermodynamic quantity with the negative of the quantity $\bar{\bar{H}}$ for the ensemble 
used to represent the system of interest. Several remarks may be made 
in this connexion. 

In the first place, it is to be noted, as has been emphasized in 
connexion with our derivation of the $H$-theorem, that the ensemble which 
we choose to represent a mechanical system of interest is determined 
by the observations we have made on the condition of that system. 
Thus also the value of $\bar{\bar{H}}$ will be determined by the nature of our 
knowledge of the condition of the system. Since the value of $\bar{\bar{H}}$ is lower the 
less exactly the quantum mechanical state of the system is specified, 
this provides the reason for the statement sometimes made that the 
entropy of a system is a measure of the degree of our ignorance as to 
its condition. From a precise point of view, however, it seems more 
clarifying to emphasize that entropy can be regarded as a quantity 
which is thermodynamically defined with the help of its relation to 
heat and temperature given by (130.1), and statistically interpreted 
with the help of the analogous relation (130.4) between the mechanical 
quantities $\bar{\bar{H}}$, $\bar{Q}$, and $\theta$. 

An interesting point arises in connexion with the entropy of systems 
which are not in a condition of thermodynamic equilibrium. In the 
older thermodynamics it was possible to say that the entropy of a 
system originally in a condition of equilibrium would increase if it 
changed spontaneously into a temporarily unstable condition; for 
example, as a result of opening a valve which would allow flow into 
an exhausted container. Nevertheless, it was not so easy to make a 
precise assignment of the value for the entropy in such a 
non-equilibrium condition, since precise changes had ultimately to be correlated 
with the help of (130.1) with reversible changes from one condition of 
equilibrium to another, and this introduced complications. With the 
help of our correlation $S \leftrightarrow -k\bar{\bar{H}}$, it is now easier to grasp what is to 
be meant by entropy in general, since the evaluation of $\bar{\bar{H}}$ is in no wise 
limited to conditions of equilibrium. 



There is another important point connected with the correlation 

$$S \leftrightarrow -k\bar{\bar{H}} = -k\sum_n P_n \log P_n. \tag{131.1}$$

In the older thermodynamics entropy was regarded as defined only to 
the extent of an arbitrary additive constant, and this circumstance was 
not affected by the interpretation of thermodynamics with the help of 
the classical statistical mechanics. From the point of view of quantum 
mechanics, however, it will be noted that the situation has been altered, 
since the quantity $\bar{\bar{H}}$ is seen to have the value zero when the system 
of interest is known to be in a single definite quantum mechanical state, 
and then to have lower but definitely calculable values for any other 
distribution. Hence the introduction of the correlation (131.1) definitely 
implies that we choose the additive constant for entropy in such a way 
that the entropy of a system will be zero for a pure state, and greater 
than zero but with a perfectly definite value for any so-called mixed 
state. This provides an explanation for the term absolute entropy which 
is sometimes used as descriptive of this situation. We shall not make 
use of this term in the present book, however, since this would imply 
a selection of states to be designated as pure that was based on an 
ultimate analysis into nuclear as well as extra-nuclear conditions. In 
the actual practice of physical chemists the specification of states which 
are treated as pure is best regarded as determined by conventions which 
must be applied in a consistent manner throughout the computation. 
Assuming the consistent application of conventions as to the 
specification of pure states, the important point to emphasize is that the 
correlation expressed by (131.1) then provides more powerful methods for the 
calculation of entropies and entropy differences than were classically 
available. In the next chapter, which gives some specific applications 
to thermodynamic problems, we shall see that these more powerful 
methods provide a satisfactory explanation for those actual relations 
between entropies which have led to various formulations of the 
so-called Nernst heat theorem or third law of thermodynamics. 

As a final point concerning the correlation between entropy $S$ and 
the statistical mechanical quantity $\bar{\bar{H}}$, as given by (131.1), it is of 
interest to consider the special case of a system which we regard as 
being with equal probability in one or another of a group of $W$ 
eigenstates between which we do not distinguish on the basis of our 
macroscopic measurements. It will be seen that the relation (131.1) would 
then assume the form 

$$S \leftrightarrow k \log W. \tag{131.2}$$




This has the form of the relation between entropy and probability 
considered by Boltzmann and Planck, and the quantity $W$ is sometimes 
spoken of as the thermodynamic probability. It is evident, however, that 
(131.2) is best regarded merely as a special case of the generally satis- 
factory and understandable expression for entropy given by (131.1). 
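
The relation of (131.2) to the general expression (131.1) is readily checked by direct computation. The following Python sketch, whose function name is our own and which expresses the entropy in units of $k$, evaluates the sum in (131.1) for a pure state, for equal probabilities over a group of $W$ eigenstates, and for an unequal distribution over the same number of states.

```python
import math


def entropy_in_units_of_k(probs):
    """Evaluate -sum_n P_n log P_n, that is, the entropy (131.1) divided by k."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to unity")
    return -sum(p * math.log(p) for p in probs if p > 0.0)


if __name__ == "__main__":
    w = 8
    print("pure state:      ", entropy_in_units_of_k([1.0, 0.0, 0.0]))
    print("W equal states:  ", entropy_in_units_of_k([1.0 / w] * w), " log W =", math.log(w))
    print("unequal mixture: ", entropy_in_units_of_k([0.7, 0.1, 0.1, 0.1]))
```

The pure state gives zero, the equal distribution reproduces $\log W$, and an unequal distribution over the same number of states gives a smaller but definite positive value.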

We may now turn to the correlation of temperature T with the 
distribution parameter $\theta$ which is given by the expression 

$$T \leftrightarrow \frac{\theta}{k}. \tag{131.3}$$

Several remarks may also be made in this connexion. 

It is of interest first of all to point out once more that the tempera- 
ture of a system is in any case a quantity to which we can assign precise 
meaning only for systems which are in a condition of equilibrium. This 
corresponds to the circumstance that the parameter $\theta$ characterizes only 
the particular kind of ensembles, with canonical distribution, which we 
use to represent systems in thermodynamic equilibrium. 

It is also of interest to note once more that our introduction of the 
concept of essential isolation has made the canonical distribution, rather 
than any form of microcanonical distribution, seem appropriate for the 
representation of equilibrium in the case of a system which would be 
regarded as isolated in a thermodynamic treatment. This now removes 
the motive which led Gibbs to attempt his second and third analogies 
for thermodynamic quantities using microcanonical ensembles. The 
analogies thus obtained for temperature were specially unsatisfactory, 
as Gibbs himself appreciated. 

Finally, it may be emphasized that our present method of correlating 
temperature $T$ with the parameter $\theta$ is very satisfactory from the 
thermodynamic point of view of the role played by $1/T$ as an integrating 
factor for the heat $dQ$ absorbed by a system in an infinitesimal 
reversible step. In our treatment of the statistical mechanical analogue 
of the second law of thermodynamics, as given in § 130, we have seen 
that $1/\theta$ plays a similar role as an integrating factor for the mean energy 
transferred by reversible thermal action to the representative 
ensemble for the thermodynamic system of interest. Our present method 
of connecting temperature with statistical mechanics may hence seem 
more fundamental than our earlier introduction of temperature into the 
formulae for the Maxwell-Boltzmann, Einstein-Bose, and Fermi-Dirac 
distributions with the help of the phenomenological properties of a 
perfect gas thermometer. Of course, it will in any case be necessary 
to determine the numerical value of the constant $k$, involved in the 
relation of the quantities $T$ and $\theta$, by considering the properties of 
some specific system; and for this purpose at least it is convenient to 
consider the properties of a perfect gas as will be done in the next 
chapter. 

In concluding this chapter, it is hoped that due appreciation will be 
felt for the importance and significance of the great achievement of 
statistical mechanics in providing a fundamental, mechanical inter- 
pretation and explanation of the principles of thermodynamics. 



XIV 

FURTHER APPLICATIONS TO THERMODYNAMICS 

132. Thermodynamic quantities in terms of the free energy 

It is evident, as shown in the last chapter, that the principles of 
statistical mechanics are sufficiently fundamental to provide a mechanical 
explanation for the phenomena of thermodynamics. It may be 
emphasized, however, that the principles of statistical mechanics are 
not only more fundamental but are also more powerful than those of 
thermodynamics. This greater power arises from the circumstance that 
statistical mechanics gives due consideration to the microscopic, in- 
ternal structures of the systems to be treated, while thermodynamics 
concerns itself solely with their macroscopic, phenomenological be- 
haviour. This makes it possible, by applying the methods of statistical 
mechanics, to obtain definite theoretical values for quantities such as 
pressures, specific heats, energy, and entropy differences, which would 
have to enter a purely thermodynamic treatment as parameters with 
values needing empirical determination. 

In this final chapter we shall now consider several specific applica- 
tions of statistical mechanics to thermodynamic systems which will 
illustrate this possibility of deepened theoretical treatment. The sys- 
tems which we choose for this purpose will be the simplest ones of 
ordinary physical-chemical interest, and, in accordance with the general 
purposes of this book, we shall make no attempt to give an account of 
the many further possible and important applications of statistical 
mechanics to thermodynamics that can be made. 

In addition to a consideration of such illustrative applications, it will 
also be necessary to pay attention to two questions of more fundamental 
character. In § 140 we shall give an account of the grand canonical 
ensemble as providing the appropriate statistical apparatus for treating 
equilibria involving changes in the amount of material which constitutes 
a system of interest. This is necessary in order to have a justification for 
the thermodynamic treatment ordinarily given to so-called ‘open’ rather 
than ‘closed’ systems. Finally, in § 141 we shall give treatment to the 
fluctuations in the properties of a thermodynamic system which can be 
expected at equilibrium. This is not only important in determining the 
extent of validity which can be ascribed to the thermodynamic procedure 
of assigning precise values to thermodynamic quantities, but can also be 
of direct observational interest under suitable circumstances. 





We must begin the chapter by recalling some simple thermodynamic 
formulae which will be needed. We may confine our attention for the 
present to situations simple enough so that the different equilibrium 
conditions of the thermodynamic system under consideration can be 
specified by its temperature T and its volume v. In accordance with 
the first law of thermodynamics, we can then write for the change in 
energy E, when a small change is made in those variables, 

$$dE = dQ - dW = dQ - p\,dv, \tag{132.1}$$

where p is the pressure which the system exerts. And, in accordance 
with the second law of thermodynamics, we can then write for the 
change in the entropy S of the system 


$$dS = \frac{dQ}{T} = \frac{dE + p\,dv}{T}. \tag{132.2}$$


We shall be specially interested, however, in the relation of different 
thermodynamic quantities to the free energy of the system, owing to 
the simple connexion which we have found between this quantity and 
the statistical mechanical quantity which we have called the 
sum-over-states. For the free energy $A$ of the system we can write, in accordance 
with the original definition of Helmholtz,† 

$$A = E - TS. \tag{132.3}$$

By differentiating this expression, we then obtain in general 

$$dA = dE - T\,dS - S\,dT. \tag{132.4}$$

And, by substituting (132.2), we obtain for the simple kind of 
situations now under consideration 

$$dA = -p\,dv - S\,dT. \tag{132.5}$$

This now provides what is necessary in order to express the various 
thermodynamic quantities of interest in terms of the free energy A. 
In accordance with (132.5), we immediately obtain for the pressure and 
entropy of the system 

$$p = -\frac{\partial A}{\partial v} \tag{132.6}$$

and 

$$S = -\frac{\partial A}{\partial T}. \tag{132.7}$$


† This quantity is not to be confused with the thermodynamic potential 

$$F = E + pv - TS,$$

sometimes called free energy by chemists. 




And, making use of the original equation (132.3) itself, we then obtain 
for the energy of the system 

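$$E = A - T\frac{\partial A}{\partial T}, \tag{132.8}$$

and hence for its heat capacity at constant volume 

$$C_v = \frac{\partial E}{\partial T} = -T\frac{\partial^2 A}{\partial T^2}. \tag{132.9}$$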
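
The use of (132.6) to (132.9) may be illustrated by a Python sketch which differentiates a trial free energy numerically. The form chosen for $A(T, v)$ is an assumption made only for the example: it is that of a perfect monatomic gas with the product of $k$ and the number of molecules set equal to unity and with additive constants dropped, and it is not taken from the present section. The function names and step sizes are likewise our own.

```python
import math


def free_energy(temp, vol):
    """Hypothetical trial free energy A(T, v) = -T (log v + 1.5 log T)."""
    return -temp * (math.log(vol) + 1.5 * math.log(temp))


def pressure(temp, vol, h=1e-5):
    """p = -dA/dv at constant T, as in (132.6), by central difference."""
    return -(free_energy(temp, vol + h) - free_energy(temp, vol - h)) / (2.0 * h)


def entropy(temp, vol, h=1e-5):
    """S = -dA/dT at constant v, as in (132.7), by central difference."""
    return -(free_energy(temp + h, vol) - free_energy(temp - h, vol)) / (2.0 * h)


def energy(temp, vol):
    """E = A + T S, which is (132.8) written with the help of (132.7)."""
    return free_energy(temp, vol) + temp * entropy(temp, vol)


def heat_capacity_v(temp, vol, h=1e-3):
    """C_v = -T d2A/dT2, as in (132.9), by a second central difference."""
    second = (free_energy(temp + h, vol) - 2.0 * free_energy(temp, vol)
              + free_energy(temp - h, vol)) / h ** 2
    return -temp * second


if __name__ == "__main__":
    temp, vol = 2.0, 3.0
    print("p   =", pressure(temp, vol), " expected T/v   =", temp / vol)
    print("E   =", energy(temp, vol), " expected 1.5 T =", 1.5 * temp)
    print("C_v =", heat_capacity_v(temp, vol), " expected 1.5")
```

For this trial function the numerical derivatives reproduce, within the error of the finite differences, the familiar results $p = T/v$, $E = \tfrac{3}{2}T$, and $C_v = \tfrac{3}{2}$ in the units adopted.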


133. Thermodynamic quantities in terms of the sum-over-states 

We are now ready to introduce statistical mechanical considerations 
with the help of the relation (122.9), already derived as giving the 
appropriate connexion between the thermodynamic quantity free 
energy $A$, and the statistical mechanical quantity sum-over-states $Z$. 
This relation has the form 

$$A = -kT\log Z, \tag{133.1}$$

where the sum-over-states is defined by the expression 

$$Z = \sum_n e^{-E_n/kT}, \tag{133.2}$$

the quantities $E_n$ being the energy eigenvalues for all the different 
steady state solutions $n$ of which the system is capable.† 

Substituting (133.1) in our previous purely thermodynamic relations 
(132.6) to (132.9), we can now write for the pressure, energy, heat 
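capacity, and entropy of the system, in the order named, 

$$p = kT\frac{\partial \log Z}{\partial v}, \tag{133.3}$$

$$E = kT^2\frac{\partial \log Z}{\partial T}, \tag{133.4}$$

$$C_v = \frac{\partial}{\partial T}\left(kT^2\frac{\partial \log Z}{\partial T}\right), \tag{133.5}$$

$$S = k\log Z + kT\frac{\partial \log Z}{\partial T} = \frac{E}{T} + k\log Z. \tag{133.6}$$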


With the help of these expressions we can now obtain explicit expres- 
sions for the quantities listed in the case of any system for which we 
have the knowledge of structure necessary for calculating the 
sum-over-states $Z$ as given by (133.2). 
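
As an illustration of these expressions, the following Python sketch takes a hypothetical finite set of energy eigenvalues, works in units for which $k = 1$ (an assumption of the example), and compares the energy and entropy obtained from $\log Z$ by (133.4) and (133.6) with the values found directly from the canonical probabilities. The derivative of $\log Z$ is taken by a central difference, and the function names are our own.

```python
import math

K = 1.0  # Boltzmann's constant in the units of the example (assumption)


def log_z(levels, temp):
    """Logarithm of the sum-over-states (133.2) for the given eigenvalues."""
    return math.log(sum(math.exp(-e / (K * temp)) for e in levels))


def energy_from_z(levels, temp, h=1e-5):
    """E = k T^2 d(log Z)/dT, equation (133.4), by central difference."""
    dlogz = (log_z(levels, temp + h) - log_z(levels, temp - h)) / (2.0 * h)
    return K * temp ** 2 * dlogz


def entropy_from_z(levels, temp):
    """S = E/T + k log Z, the second form of equation (133.6)."""
    return energy_from_z(levels, temp) / temp + K * log_z(levels, temp)


def direct_energy_and_entropy(levels, temp):
    """Mean energy and -k sum P log P computed from the canonical probabilities."""
    weights = [math.exp(-e / (K * temp)) for e in levels]
    z = sum(weights)
    probs = [w / z for w in weights]
    mean_e = sum(p * e for p, e in zip(probs, levels))
    entropy = -K * sum(p * math.log(p) for p in probs if p > 0.0)
    return mean_e, entropy


if __name__ == "__main__":
    levels = [0.0, 1.0, 1.0, 3.0]  # hypothetical eigenvalues, one level doubly degenerate
    print(energy_from_z(levels, 1.5), entropy_from_z(levels, 1.5))
    print(direct_energy_and_entropy(levels, 1.5))
```

The two lines of output agree to within the error of the numerical differentiation, which shows in a simple case that the whole of the information is contained in $Z$ as a function of $T$.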

† It may again be pointed out that the sum-over-states $Z$ provides in the present 
treatment the computational apparatus which is provided in the treatments of Darwin 
and Fowler by their partition functions. 




We have picked pressure, energy, heat capacity, and entropy as a 
list of thermodynamic quantities which we shall wish to evaluate for 
various typical systems, since these particular quantities provide the 
usual starting-point for the thermodynamic computations actually 
carried out in practice by physical chemists. The expressions for such 
a list of quantities will, of course, contain no information not already 
implied by an expression for $Z$ in terms of $v$ and $T$. Nevertheless, 
having once obtained a set of expressions for $p$, $E$, $C_v$, and $S$ in terms 
of $v$ and $T$, it will then be natural to follow ordinary thermodynamic 
procedure and calculate further quantities without returning to the 
statistical mechanical basis from which we have started. Thus, as 
examples, we may calculate the free energy A itself from 


$$A = E - TS, \tag{133.7}$$

the thermodynamic potential $F$ from 

$$F = E + pv - TS, \tag{133.8}$$

or the heat capacity at constant pressure $C_p$ from 



$$C_p = C_v - T\,\frac{(\partial p/\partial T)_v^{2}}{(\partial p/\partial v)_T}, \tag{133.9}$$


once we have evaluated the previous expressions in terms of v and T. 

The expression for the energy of a system E provided by (133.4) will 
give values based on an energy zero-point which will itself be deter- 
mined by the energy zero-point selected for the eigenvalues of energy 
$E_n$ used in evaluating the sum-over-states in accordance with (133.2). 
On the other hand, the expression for the entropy of a system $S$ 
provided by (133.6) will give values, as noted at the end of the preceding 
chapter, based on the definite zero-point implied, in correlating entropy 
with statistical mechanical quantities by taking 

$$S = -k\bar{\bar{H}} = -k\sum_n P_n \log P_n \tag{133.10}$$

without including any additive constant. This makes the entropy zero 
for any pure state — that is, when the probability for some particular 
quantum mechanical state n is known to be unity. We shall have 
occasion to direct specific attention to these zero-points in what follows. 

134. Sum-over-states as dependent on molecular states 

It will be seen from the foregoing that definite numerical values can 
be obtained for quantities of thermodynamic interest as soon as we can 
obtain an appropriate evaluation of the sum-over-states, 

$$Z = \sum_n e^{-E_n/kT}, \tag{134.1}$$

for the system of interest. The quantities $E_n$ occurring in the 
expression for the sum-over-states $Z$ are the energy eigenvalues for the 
different steady states that the system as a whole can exhibit. In 
the case of systems composed of weakly interacting elements, such as 
a dilute gas where the energy levels of the individual molecules are not 
greatly affected by collisions, the calculation of $Z$ can be conveniently 
made to depend on the energy eigenvalues $\epsilon_i$ for the individual elements 
of which the system is composed. 

In making such a calculation to determine the dependence of the 
sum-over-states for the system as a whole on the states of the individual 
elements of which the system is composed, it is necessary to distinguish 
between cases where the specification of the state of the system as 
a whole depends on which particular elements are assigned to their 
different elementary eigenstates, and cases where the specification is 
only dependent on the number of elements assigned to different ele- 
mentary eigenstates. We may begin by treating the first of these two 
cases. 


(a) Case of permanently distinguishable elements. Let us consider 
a system composed of $n$ similar, weakly interacting elements; and let 
each element exhibit the same set of states of energies $\epsilon_i$, but let the 
elements be distinguishable one from another so that an eigenstate of 
the system as a whole must be specified by giving the elementary 
eigenstate for each of the $n$ different elements. In accordance with 
our previous nomenclature such a system may be called a 
Maxwell-Boltzmann system of elements. As an example we could take a system 
of $n$ oscillators all exhibiting the same frequency $\nu$, but distinguishable 
from each other by permanent spatial location or orientation. 

If we now use the indices $i_1, i_2, \dots, i_n$ to denote the eigenstates of 
these individual elements, it is evident for the case of weak interaction 
that the energy for a state of the system as a whole could be expressed 
in the form 

$$E = \epsilon_{i_1} + \epsilon_{i_2} + \dots + \epsilon_{i_n}, \tag{134.2}$$

and that each different assignment of values to the indices $i_1, i_2, \dots, i_n$ 
would correspond to a different eigenstate of the system as a whole. 
Hence, from the definition of the sum-over-states given by (134.1), we 
can now express that quantity in the form 

$$Z = \sum_{i_1, i_2, \dots, i_n} e^{-(\epsilon_{i_1} + \epsilon_{i_2} + \dots + \epsilon_{i_n})/kT}, \tag{134.3}$$

where the summation is to be taken over all possible values of the 
indices $i_1, i_2, \dots, i_n$. It will be immediately seen, however, from the 
possibility of decomposing the quantity to be summed into a product 
of factors of the form $e^{-\epsilon_i/kT}$, and from the assumed identity of the 
energy spectra for the different elements, that this can be rewritten in 
the form 

$$Z = \left(\sum_i e^{-\epsilon_i/kT}\right)^n, \tag{134.4}$$

where the quantities $\epsilon_i$ are the energy eigenvalues for the different 
eigenstates $i$ of a single element of the kind under consideration. In 
carrying out the indicated summation over $i$, each separate eigenstate 
is to be considered. In the case of degeneracy, where several 
eigenstates may have the same eigenvalue of energy, the above formula can 
be rewritten if desired in the form 

$$Z = \left(\sum_i g_i\, e^{-\epsilon_i/kT}\right)^n, \tag{134.5}$$

where $g_i$ denotes the number of eigenstates which happen to have the 
same energy $\epsilon_i$, and the indices $i$ are now used to distinguish merely 
between different energy levels. We thus obtain the desired expressions 
for the sum-over-states for a Maxwell-Boltzmann system composed of 
n weakly interacting elements. 
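
The factorization of the sum-over-states for distinguishable elements can be verified in a small case by direct enumeration. The following Python sketch assumes a hypothetical element with four eigenstates, two of which have the same energy, and a system of three such elements. It compares the sum taken over every assignment of eigenstates to elements with the power of the single-element sum, written both over eigenstates and over energy levels with their degeneracies. The names and numerical values are our own.

```python
import itertools
import math


def z_brute_force(state_energies, n, kt):
    """Sum exp(-E/kT) over every assignment of an eigenstate to each of n elements."""
    total = 0.0
    for assignment in itertools.product(state_energies, repeat=n):
        total += math.exp(-sum(assignment) / kt)
    return total


def z_factorised(state_energies, n, kt):
    """n-th power of the single-element sum taken over separate eigenstates."""
    return sum(math.exp(-e / kt) for e in state_energies) ** n


def z_with_degeneracies(level_weights, n, kt):
    """n-th power of the single-element sum over levels, each weighted by its g."""
    return sum(g * math.exp(-e / kt) for e, g in level_weights.items()) ** n


if __name__ == "__main__":
    states = [0.0, 1.0, 1.0, 2.5]          # hypothetical eigenstate energies
    levels = {0.0: 1, 1.0: 2, 2.5: 1}      # the same spectrum grouped by level
    print(z_brute_force(states, 3, 0.8))
    print(z_factorised(states, 3, 0.8))
    print(z_with_degeneracies(levels, 3, 0.8))
```

The three values coincide apart from rounding. The enumeration involves a number of terms equal to the number of eigenstates raised to the power $n$, which makes plain the practical advantage of the factorized form.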

(b) Case of Einstein-Bose and Fermi-Dirac gases. We may now turn
to the treatment of systems composed of similar elements, where the
eigenstates for the system as a whole are to be specified by stating
merely the number of elements in their different possible individual
eigenstates, without any distinction as to which elements are so
assigned. As examples of such systems we have Einstein-Bose and
Fermi-Dirac gases which exhibit eigensolutions which are respectively
either symmetric or antisymmetric in the indices denoting the particles
composing the gas. For the purposes that we now have in view, we
shall be specially interested in the sum-over-states when the phenomenon
of gas degeneration can be neglected. The effects of this phenomenon
are negligible for ordinary gases under usual circumstances, and
have already been treated by appropriate methods for the special cases
where they become important. In the absence of appreciable degeneration
we shall find that the calculation then leads to a very simple result.

Let us consider a dilute gas composed of n similar, weakly interacting
molecules, each exhibiting the same set of states of energies
$\epsilon_k$. Using the symbols $\epsilon_1, \epsilon_2, \ldots, \epsilon_n$ to denote possible values for the
energies of the n molecules composing the system, we can then write

$$E = \epsilon_1 + \epsilon_2 + \cdots + \epsilon_n \qquad (134.6)$$

as an expression, in the case of weak interaction, for a possible value of
the energy of the system as a whole. Noting that the total number
of terms on the right-hand side of this expression is equal to n, and
using $n_1, n_2, n_3, \ldots$ to denote the numbers of repetitions in which different
terms $\epsilon$ refer to the same molecular state, it is evident that the
above value for E could be obtained in

$$\frac{n!}{n_1!\, n_2!\, n_3! \cdots} \qquad (134.7)$$

different ways, by merely permuting the assignment of particle indices 
1, ..., n to the energies e. In the immediately preceding consideration of 
systems composed of distinguishable elements we regarded such dif- 
ferent permutations as leading to different states of the system as a 
whole. In our present consideration, however, we know from the sym- 
metry properties of the eigensolutions for the system as a whole that 
such permutations do not lead to new states. Hence we can now 
evidently write as an appropriate expression for the sum-over-states 
of a dilute gas composed of n similar molecules 


$$Z = \sum \frac{n_1!\, n_2!\, n_3! \cdots}{n!}\, e^{-(\epsilon_1 + \epsilon_2 + \cdots + \epsilon_n)/kT} \qquad (134.8)$$


where the factor before the exponential term takes care of the circum- 
stance that the indicated summation would introduce the same states 
more than once. In applying this expression to Einstein-Bose gases,
we must sum over all possible values of $\epsilon_1, \ldots, \epsilon_n$. But in applying it to
Fermi-Dirac gases, we must eliminate, in accordance with the Pauli
exclusion principle, those terms in which two or more energies $\epsilon_1, \ldots, \epsilon_n$
refer to the same state, since there is then no corresponding eigenstate
for the system as a whole.

We must now consider the simplification of this general expression,
under conditions of negligible degeneration, when the temperature of
the gas is high and its volume large. Under such circumstances most
of the terms, which make important contributions in the summation
indicated by (134.8), would be such that the same state is not assigned
to more than a single molecule, since many different states are available
for the molecules without going to such high total energies as to make
the exponential term negligibly small. This has two effects. In the
first place it permits us to neglect the fact that the summation as
written contains a few terms which ought to be excluded to get a precisely
correct result in the Fermi-Dirac case, and in the second place it
permits us as an approximation to take the factorials $n_1!, n_2!, n_3!, \ldots$ as
all having the value unity. Hence we can now write as a satisfactory
approximation, for either kind of gas, in the absence of appreciable
degeneration,

$$Z = \frac{1}{n!} \sum e^{-(\epsilon_1 + \epsilon_2 + \cdots + \epsilon_n)/kT} \qquad (134.9)$$

where we now take all possible values $\epsilon$ for the energies of the n particles.

Noting once more the possibility of decomposing the quantity to be
summed into a product of factors of the form $e^{-\epsilon_k/kT}$, this can now
be rewritten in the form

$$Z = \frac{1}{n!} \Bigl(\sum_k e^{-\epsilon_k/kT}\Bigr)^n \qquad (134.10)$$

where the summation is to be taken over all eigenstates k for a single
particle or molecule of the kind in question. We thus obtain the desired
simple expression for the sum-over-states in the case of a dilute gas of
n similar molecules, in the absence of appreciable degeneration. If
desired, in the case of multiple energy levels, the formula can also be
written in the form

$$Z = \frac{1}{n!} \Bigl(\sum_i g_i\, e^{-\epsilon_i/kT}\Bigr)^n \qquad (134.11)$$

where the summation is now over energy levels rather than explicitly
over individual states.
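The quality of the dilute-gas approximation can be illustrated on a toy system. The sketch below is not part of the original treatment: it assumes a hypothetical ladder of 600 equally spaced single-particle states and only n = 2 particles, so that the exact Einstein-Bose sum (every set of occupation numbers counted once) and the exact Fermi-Dirac sum (no state doubly occupied) can be enumerated directly and compared with the single-particle sum raised to the power n and divided by n!.

```python
import itertools
import math

def weight(energies, kT):
    """Boltzmann factor for one state of the whole system."""
    return math.exp(-sum(energies) / kT)

def z_bose(states, n, kT):
    """Each distinct set of occupation numbers counted once (multisets of n states)."""
    return sum(weight(c, kT)
               for c in itertools.combinations_with_replacement(states, n))

def z_fermi(states, n, kT):
    """As above, but terms with a doubly occupied state are excluded."""
    return sum(weight(c, kT) for c in itertools.combinations(states, n))

def z_dilute(states, n, kT):
    """Approximate form: (single-particle sum)**n / n!."""
    single = sum(math.exp(-eps / kT) for eps in states)
    return single ** n / math.factorial(n)

# Hypothetical spectrum: 600 non-degenerate states spaced 0.01 apart (arbitrary units).
states = [0.01 * k for k in range(600)]
n = 2

def compare(kT):
    return z_bose(states, n, kT), z_dilute(states, n, kT), z_fermi(states, n, kT)

hot_bose, hot_dilute, hot_fermi = compare(1.0)      # about a hundred states accessible
cold_bose, cold_dilute, cold_fermi = compare(0.01)  # only a few states accessible
print(hot_bose / hot_dilute, hot_fermi / hot_dilute)
print(cold_bose / cold_dilute, cold_fermi / cold_dilute)
```

With many accessible states the two exact sums bracket the approximate one closely; when only a few states are accessible the ratios depart strongly from unity, which is the regime of gas degeneration excluded above.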


135. Perfect monatomic gas 

(a) Sum-over-states. We are now ready to apply the methods which 
we have developed above to investigate the properties of specific 
thermodynamic systems. We shall first treat the case of a perfect 
monatomic gas. 

Let us consider a gas composed of n similar particles, enclosed in
a container of volume v, and having the temperature T. Under the
conditions of negligible degeneracy, necessarily holding in the case of
perfect gases, we can then write for the sum-over-states

$$Z = \frac{1}{n!} \Bigl(\sum_k e^{-\epsilon_k/kT}\Bigr)^n \qquad (135.1)$$

where the indicated summation is to be taken over the individual states
k for a single particle of the kind under consideration. To evaluate the
summation, we may make use of our previous expression (71.16) for
the number of eigenstates dG for a single particle in the range $\epsilon$ to
$\epsilon + d\epsilon$. This can be written in the form

$$dG = g\,\frac{2\pi v}{h^3}\,(2m)^{3/2}\,\epsilon^{1/2}\,d\epsilon \qquad (135.2)$$





where m is the mass of the particle, and, for the sake of generality in 
case the particle has an intrinsic angular momentum, g stands for the 
number of eigenvalues of the component of angular momentum parallel 
to a selected axis. With the help of this expression we can now replace 
summation by integration and write 


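$$\sum_k e^{-\epsilon_k/kT} = g\,\frac{2\pi v}{h^3}\,(2m)^{3/2} \int_0^\infty \epsilon^{1/2} e^{-\epsilon/kT}\, d\epsilon = \frac{g v}{h^3}\,(2\pi m k T)^{3/2} \qquad (135.3)$$

with the help of the known value for the definite integral. Substituting
(135.3) in (135.1), we then obtain for the sum-over-states of a perfect
monatomic gas

$$Z = \frac{1}{n!} \Bigl[\frac{g v}{h^3}\,(2\pi m k T)^{3/2}\Bigr]^n \qquad (135.4)$$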
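The "known value for the definite integral" used here is $\int_0^\infty \epsilon^{1/2} e^{-\epsilon/kT}\,d\epsilon = \tfrac{1}{2}\sqrt{\pi}\,(kT)^{3/2}$. As a sketch only, with illustrative function names and arbitrary values of kT, it can be confirmed by a simple quadrature after the substitution $\epsilon = kT\,x^2$:

```python
import math

def translational_integral(kT, steps=4000, cutoff=12.0):
    """Trapezoid estimate of the integral of sqrt(eps) * exp(-eps/kT) from 0 to infinity.

    With eps = kT * x**2 the integral becomes
    2 * kT**1.5 * integral of x**2 * exp(-x**2) dx, truncated here at x = cutoff.
    """
    h = cutoff / steps
    total = 0.0
    for j in range(1, steps):  # both end points contribute negligibly
        x = j * h
        total += x * x * math.exp(-x * x)
    return 2.0 * kT ** 1.5 * total * h

def closed_form(kT):
    """The tabulated value (sqrt(pi)/2) * kT**1.5."""
    return 0.5 * math.sqrt(math.pi) * kT ** 1.5

for kT in (0.5, 1.0, 3.0):
    print(kT, translational_integral(kT), closed_form(kT))
```

Multiplying this value by $g\,(2\pi v/h^3)(2m)^{3/2}$ gives $(gv/h^3)(2\pi mkT)^{3/2}$, the right-hand side quoted for the single-particle sum.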

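In our actual applications we shall be interested in the logarithm of
this quantity. Introducing the Stirling approximation

$$n! = \sqrt{2\pi n}\,\Bigl(\frac{n}{e}\Bigr)^n \qquad (135.5)$$

for the factorial of the very large number n, we then obtain

$$\log Z = n \log\Bigl[\frac{g v}{h^3}(2\pi m k T)^{3/2}\Bigr] - n \log n + n - \log\sqrt{2\pi n}$$
$$\phantom{\log Z} = n\Bigl\{\log\Bigl[\frac{g v e}{n h^3}(2\pi m k T)^{3/2}\Bigr] - \frac{1}{n}\log\sqrt{2\pi n}\Bigr\}. \qquad (135.6)$$

Hence, except for a term which goes to zero as the number of particles
n increases, we can write

$$\log Z = n \log\Bigl[\frac{g v e}{n h^3}(2\pi m k T)^{3/2}\Bigr]. \qquad (135.7)$$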
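The size of the two approximations made at this step can be gauged with a few lines of arithmetic. The sketch below (function names are illustrative) compares the Stirling form $\sqrt{2\pi n}\,(n/e)^n$ with the exact factorial through `math.lgamma`, and evaluates the per-particle term $(1/n)\log\sqrt{2\pi n}$ that is dropped in the final expression for log Z.

```python
import math

def log_factorial_exact(n):
    """log n! evaluated through the log-gamma function."""
    return math.lgamma(n + 1.0)

def log_factorial_stirling(n):
    """log of sqrt(2 pi n) * (n/e)**n."""
    return 0.5 * math.log(2.0 * math.pi * n) + n * math.log(n) - n

def dropped_term_per_particle(n):
    """(1/n) * log sqrt(2 pi n), the term neglected for large n."""
    return 0.5 * math.log(2.0 * math.pi * n) / n

for n in (10.0, 1.0e3, 1.0e6):
    error = log_factorial_exact(n) - log_factorial_stirling(n)
    print(n, error, dropped_term_per_particle(n))
```

For particle numbers of the order met in a gas sample the dropped term is far below any measurable contribution to log Z per particle.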


(b) Thermodynamic quantities. We may now substitute this expression
for log Z in our previous expressions (133.3) to (133.6) for the
pressure p, energy E, heat capacity $C_v$, and entropy S of any system.
We thus obtain for a perfect monatomic gas, with a minimum of computation,
the important collection of results

$$p = \frac{nkT}{v}, \qquad (135.8)$$

$$E = \tfrac{3}{2}\,nkT, \qquad (135.9)$$

$$C_v = \tfrac{3}{2}\,nk, \qquad (135.10)$$

$$S = nk \log\Bigl[\frac{g v e^{5/2}}{n h^3}(2\pi m k T)^{3/2}\Bigr]. \qquad (135.11)$$




The simplicity with which these results have been obtained illustrates
the powerful character of our present methods. The first three expressions
for pressure, energy, and heat capacity are of course very familiar
ones.

The last of the equations gives the so-called Sackur-Tetrode expression
for the entropy of a perfect monatomic gas, modified merely by
the inclusion of the factor g, to allow for the possibility that the translational
states of the particles composing the gas may, from a modern
point of view, have a g-fold degeneracy, owing, for instance, to the
presence of intrinsic angular momentum. It will be noted, in accordance
with the discussion at the end of § 133, that our present treatment of
thermodynamic properties from the standpoint of a quantum statistical
mechanics leads to a perfectly definite value for the entropy of the gas
in the condition considered, based on a starting-point which would
make the entropy zero if the system were known to be in a single
definite quantum mechanical state.
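As an illustration of the definite numerical value to which the Sackur-Tetrode expression $S = nk\log[(g v e^{5/2}/n h^3)(2\pi mkT)^{3/2}]$ leads, the following sketch evaluates it per mole, eliminating v/n through the perfect gas law. The choice of gas (argon, atomic mass taken as 39.948 u), the conditions (298.15 K, 10^5 Pa), the use of SI units with present-day constants, and g = 1 are all assumptions of the example and are not taken from the text.

```python
import math

K_B = 1.380649e-23        # Boltzmann constant, J/K
H = 6.62607015e-34        # Planck constant, J s
N_A = 6.02214076e23       # molecules per mole
AMU = 1.66053906660e-27   # atomic mass unit, kg

def sackur_tetrode_per_particle(mass, T, p, g=1):
    """S/(n k) = log[(g v/n) (2 pi m k T)**1.5 / h**3] + 5/2, with v/n = kT/p."""
    volume_per_particle = K_B * T / p
    thermal_factor = (2.0 * math.pi * mass * K_B * T) ** 1.5 / H ** 3
    return math.log(g * volume_per_particle * thermal_factor) + 2.5

# Assumed example: argon at 298.15 K and 1e5 Pa, non-degenerate ground level.
mass_argon = 39.948 * AMU
s_over_nk = sackur_tetrode_per_particle(mass_argon, 298.15, 1.0e5)
molar_entropy = s_over_nk * K_B * N_A   # J per mole per kelvin
print(s_over_nk, molar_entropy)
```

The result is close to 155 J per mole per degree; the point of the sketch is that the expression contains no undetermined additive constant.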

(c) Evaluation of constant k. So far we have assigned no definite
numerical value to the constant k which was introduced into the correlations
of entropy and temperature with mechanical quantities,

$$S = -k\,\bar{\bar{H}} \quad \text{and} \quad T = \frac{\theta}{k} \qquad (135.12)$$

in order to allow for difference in units. With the help of the expression
for the pressure of a perfect gas given by (135.8) we now see, however,
as indicated by the notation chosen, that this quantity actually is the
so-called Boltzmann constant

$$k = \frac{R}{N} \qquad (135.13)$$

where R is the gas constant per mol and N is the number of molecules
in a mol.

Using the system of units which depends on the centimetre, gram,
second, and degree centigrade, the quantity in question has the value†

$$k = 1.380 \times 10^{-16}\ \text{erg/deg.} \qquad (135.14)$$

136. Perfect gases composed of more complicated molecules 

(a) Sum-over-states. We must now consider gases composed of 
similar elements which are more complicated than simple particles. 
The energy eigenvalues for a single element can then depend not only 
on its translational state but on its internal state as well, and we have 

to allow for this in computing the actual value of the sum-over-states.
The case of greatest practical interest is provided by gases composed
of molecules, containing more than a single atom, which exhibit different
internal states of rotation and nuclear vibration. The methods
of treatment are general enough, however, to apply also when it becomes
necessary to consider the different internal electronic states
which can be excited both in the case of molecules and single atoms
as well.

† Birge, Phys. Rev. Supplement, 1, 1 (1929), modified to correspond to the new value
4.803 × 10^{-10} e.s.u. for e.

Let us consider a gas composed of n similar, chemically non-interacting
molecules, enclosed in a container of volume v, and having the
temperature T. Under conditions of negligible gas degeneration we can
then again write, in accordance with (134.10),

$$Z = \frac{1}{n!} \Bigl(\sum_m e^{-\epsilon_m/kT}\Bigr)^n \qquad (136.1)$$

as a general expression for the sum-over-states, where the different
states of a single molecule in the container are designated by the letter
m. In order to evaluate the indicated summation over the states m,
however, we must now consider both the translational and the internal
condition of the molecule. To carry this out it is convenient to regard
the state of the molecule as specified by two letters k and i, referring
respectively to the translational and internal conditions. Except for
a negligible relativity effect, we can then take the total energy for a
state ki as equal to the sum of a term $\epsilon_k$ depending solely on the index
k in the same way as for simple particles, and a term $\epsilon_i$ which depends
solely on the internal condition, i.e. we can take

$$\epsilon_{ki} = \epsilon_k + \epsilon_i. \qquad (136.2)$$

Our expression for the sum-over-states is then seen to assume the form

$$Z = \frac{1}{n!} \Bigl(\sum_k e^{-\epsilon_k/kT}\Bigr)^n \Bigl(\sum_i e^{-\epsilon_i/kT}\Bigr)^n. \qquad (136.3)$$

For the purposes of practical computation, it is usually more convenient
to rewrite this expression in a form

$$Z = \frac{1}{n!} \Bigl(\sum_k e^{-\epsilon_k/kT}\Bigr)^n \Bigl(\sum_i g_i\, e^{-\epsilon_i/kT}\Bigr)^n \qquad (136.4)$$

which gives explicit recognition to the possibility that any given internal
energy level including the lowest may be degenerate. The second
summation is now over different energy levels i, $g_i$ being the number
of separate states having the same internal energy $\epsilon_i$. Taking the
logarithm of this quantity, and noting our previous value (135.7) for
monatomic gases where Z corresponds to the first two factors in (136.4),
we then obtain

$$\log Z = n \log\Bigl[\frac{v e}{n h^3}(2\pi m k T)^{3/2}\Bigr] + n \log \sum_i g_i\, e^{-\epsilon_i/kT} \qquad (136.5)$$

as the desired expression for the sum-over-states.

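(b) Thermodynamic quantities. Substituting this expression for Z
into equations (133.3) to (133.6), we now obtain for the pressure, energy,
heat capacity at constant volume, and entropy of a dilute gas composed
of n molecules of mass m, in volume v and at temperature T,

$$p = \frac{nkT}{v}, \qquad (136.6)$$

$$E = \tfrac{3}{2}\,nkT + nkT^2 \frac{\partial}{\partial T} \log \sum_i g_i\, e^{-\epsilon_i/kT}, \qquad (136.7)$$

$$C_v = \tfrac{3}{2}\,nk + 2nkT \frac{\partial}{\partial T} \log \sum_i g_i\, e^{-\epsilon_i/kT} + nkT^2 \frac{\partial^2}{\partial T^2} \log \sum_i g_i\, e^{-\epsilon_i/kT}, \qquad (136.8)$$

$$S = nk \log\Bigl[\frac{v e^{5/2}}{n h^3}(2\pi m k T)^{3/2}\Bigr] + nk \log \sum_i g_i\, e^{-\epsilon_i/kT} + nkT \frac{\partial}{\partial T} \log \sum_i g_i\, e^{-\epsilon_i/kT}. \qquad (136.9)$$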


The first of these formulae, (136.6), is that for the pressure of a perfect
gas in agreement with the circumstance that we have treated gases at
high enough dilutions and temperatures so that the phenomenon of
gas degeneration and the effects of collision on energy levels could be
neglected.

The second formula, (136.7), may be regarded as expressing the total
energy of the gas as a sum of translational and internal parts. This will
be better appreciated by carrying out the indicated differentiation in
the last term of (136.7) and re-expressing the energy in the form

$$E = \tfrac{3}{2}\,nkT + n\,\frac{\sum_i g_i\, \epsilon_i\, e^{-\epsilon_i/kT}}{\sum_i g_i\, e^{-\epsilon_i/kT}} \qquad (136.10)$$

where the first term is a well-known expression for the mean kinetic
energy of the n molecules, and the second term is seen to be an appropriate
expression for the mean internal energy under conditions where
the Maxwell-Boltzmann distribution over internal states i can be taken
as holding. The actual value that we obtain for the energy E will
depend of course on the energy zero-point that we select. To examine
this let us denote the energy assigned to a molecule at rest in its lowest
internal state by $\epsilon_0$. Introducing terms which would mutually cancel,
we may then rewrite our expression (136.10) for energy as a sum of
three terms

$$E = n\epsilon_0 + \tfrac{3}{2}\,nkT + n\,\frac{\sum_i g_i\, (\epsilon_i - \epsilon_0)\, e^{-(\epsilon_i - \epsilon_0)/kT}}{\sum_i g_i\, e^{-(\epsilon_i - \epsilon_0)/kT}}, \qquad (136.11)$$

which are respectively seen to give the energy of the molecules at rest
in the ground-level, their kinetic energy of translation, and the internal
energy that has been excited. In the practical computations of physical
chemists the energy $\epsilon_0$ assigned to the ground-level may be taken as is
convenient for the problem at hand. For example, it may be taken as
zero, or as the potential energy necessary to remove a molecule from
some condensed phase of interest, or as the energy involved in forming
the molecule from its atoms, in accordance with the nature of the
problem under consideration.

The third of the formulae given above, (136.8), is that for the heat
capacity of the gas at constant volume as obtained by differentiating
the expression for energy with respect to temperature. The result is
of course independent of any assignment made to the energy of the
ground-level.
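The statements about the energy zero-point can be made concrete with a small computation. The sketch below assumes a hypothetical molecule with two internal levels (energies 0 and 1 in arbitrary units, weights 4 and 2) and works per molecule in units where k = 1; the names and numbers are illustrative only. It evaluates the energy as 3kT/2 plus the Boltzmann-weighted mean internal energy, and obtains the heat capacity by numerical differentiation, once with the ground-level at zero and once with every level shifted by a constant.

```python
import math

def mean_internal_energy(levels, weights, kT):
    """sum g*eps*exp(-eps/kT) / sum g*exp(-eps/kT), per molecule."""
    num = sum(g * e * math.exp(-e / kT) for e, g in zip(levels, weights))
    den = sum(g * math.exp(-e / kT) for e, g in zip(levels, weights))
    return num / den

def energy_per_molecule(levels, weights, kT):
    """Translational part 3kT/2 plus the mean internal energy."""
    return 1.5 * kT + mean_internal_energy(levels, weights, kT)

def heat_capacity_per_molecule(levels, weights, kT, dkT=1.0e-4):
    """C_v/(n k) from a centred difference of the energy with respect to kT."""
    up = energy_per_molecule(levels, weights, kT + dkT)
    down = energy_per_molecule(levels, weights, kT - dkT)
    return (up - down) / (2.0 * dkT)

# Hypothetical internal spectrum and an arbitrary shift of the energy zero-point.
levels, weights, kT, shift = [0.0, 1.0], [4, 2], 0.5, 7.3
shifted = [e + shift for e in levels]

e_plain = energy_per_molecule(levels, weights, kT)
e_shifted = energy_per_molecule(shifted, weights, kT)
c_plain = heat_capacity_per_molecule(levels, weights, kT)
c_shifted = heat_capacity_per_molecule(shifted, weights, kT)
print(e_shifted - e_plain, c_plain, c_shifted)
```

The energy changes by exactly the shift per molecule, while the heat capacity is unaffected; at low temperatures the mean internal energy reckoned from the ground-level tends to zero.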

The last of the above formulae (136.9) gives the entropy of the gas
based on a starting-point which assigns zero entropy to any pure
quantum mechanical state. By comparing with (136.7), it can also be
written in the form

$$S = \frac{E}{T} + nk \log\Bigl[\frac{v e}{n h^3}(2\pi m k T)^{3/2}\Bigr] + nk \log \sum_i g_i\, e^{-\epsilon_i/kT} \qquad (136.12)$$

which sometimes proves convenient. And by introducing terms which
mutually cancel, this can be re-expressed in a form

$$S = \frac{E - n\epsilon_0}{T} + nk \log\Bigl[\frac{v e}{n h^3}(2\pi m k T)^{3/2}\Bigr] + nk \log \sum_i g_i\, e^{-(\epsilon_i - \epsilon_0)/kT} \qquad (136.13)$$

which only involves energies reckoned from the ground-level up. This,
then, shows clearly that our expressions for entropy are independent
of the selection of energy zero-point, in agreement with the circumstance
that they have their own starting-point of zero entropy for any
pure state as mentioned above. For the purposes of the physical
chemist, it is often more convenient to have the dependence of entropy
on pressure rather than on volume, as can be obtained by substituting

$$v = \frac{nkT}{p} \qquad (136.14)$$

in accordance with the gas laws. It is also convenient to rewrite the
last term of (136.13) in the form

$$nk \log \sum_i g_i\, e^{-(\epsilon_i - \epsilon_0)/kT} = nk \log \sum_i \frac{g_i}{g_0}\, e^{-(\epsilon_i - \epsilon_0)/kT} + nk \log g_0, \qquad (136.15)$$

where $g_0$ is the multiplicity of the ground-level, in order to separate
out the part $nk \log g_0$ which does not go to zero at low temperatures.
Our expression for entropy (136.13) then assumes the form

$$S = \frac{E - n\epsilon_0}{T} + \tfrac{5}{2}\,nk \log T - nk \log p + nk \log \sum_i \frac{g_i}{g_0}\, e^{-(\epsilon_i - \epsilon_0)/kT} + nk \log \frac{e\,(2\pi m)^{3/2} k^{5/2} g_0}{h^3} \qquad (136.16)$$

where we have rearranged in a manner to segregate constant and
variable terms.

(c) Energy and entropy of actual monatomic and diatomic gases. The
application of the foregoing formulae for energy and entropy to actual
gases is of course dependent on the requisite knowledge of the energy
levels $\epsilon_i$ and quantum weights $g_i$ for those states of the gas molecule
which will introduce appreciable terms in the summation $\sum_i g_i\, e^{-\epsilon_i/kT}$ at
the temperature under consideration. Such information as to energy
levels and quantum weights must be obtained in general from spectroscopic
data and its quantum mechanical interpretation.

In making applications it is often necessary to undertake a detailed
specific treatment for the particular kind of gas molecule involved, in
order to obtain high precision and to take account of all possible complicating
factors. We may illustrate the methods, nevertheless, by
giving general, although to some extent approximate and over-simplified,
treatments for the cases of gases composed of monatomic and of
diatomic molecules. In the case of polyatomic gases it is best at the
present time not to try to develop general formulae, but to make
specific calculations for each particular case, on account of the complex
behaviour of molecules containing more than two atoms. We shall be
specially interested in treating the expressions for the energy and
entropy of monatomic and diatomic gases in a manner similar to that
commonly employed in physical-chemical computations.

Let us first consider an actual monatomic gas, composed of n atoms,
having a ground-level which may be degenerate, and having higher
electronic levels which might become appreciably excited. We may
then use the general symbols $\epsilon_i$ and $g_i$ to denote the energies and degrees
of degeneracy of all the different electronic levels i including the ground-level,
and may introduce the special symbols $\epsilon_0$ and $g_0$ to denote when
necessary the particular values of those quantities for the ground-level.
The energy $\epsilon_0$ assigned to the ground-level may be, for example, the
potential energy which a free atom would have with reference to removal
from a crystal of the substance at very low temperatures. The
degeneracy $g_0$ of the ground-level may be due to the presence of nuclear
spin or of unresolved electronic states.

For the practical computation of the energy of such a monatomic gas
it then proves convenient to take our previous expression (136.11)

$$E = n\epsilon_0 + \tfrac{3}{2}\,nkT + n\,\frac{\sum_i g_i\, (\epsilon_i - \epsilon_0)\, e^{-(\epsilon_i - \epsilon_0)/kT}}{\sum_i g_i\, e^{-(\epsilon_i - \epsilon_0)/kT}}, \qquad (136.17)$$

where the evaluation of the internal energy as given by the last term
will have to be made, with the help of spectroscopic information, by
actually summing over all levels i in the numerator and denominator
which make an appreciable contribution at the temperature T under
consideration. In making these summations, it will be noticed that the
term corresponding to the ground-level, $\epsilon_i = \epsilon_0$, disappears from the
numerator but not from the denominator of the above expression for
internal energy. As a consequence it will be seen that the internal
energy would approach zero at temperatures low enough so that the
energy $(\epsilon_1 - \epsilon_0)$ needed for exciting the first level above the ground-level
is large compared with kT. For many actual monatomic gases this
means that internal energies can be neglected up to very high temperatures.
This is not always possible, however. For example, in the
case of the monatomic halogens the two lowest electronic levels form
such a close doublet that the excitation of the upper level must be
considered at those temperatures where the dissociation of the diatomic
into the monatomic gas would be studied.

For the practical computation of the entropy of monatomic gases it
proves convenient to take our previous equation (136.16)

$$S = \frac{E - n\epsilon_0}{T} + \tfrac{5}{2}\,nk \log T - nk \log p + nk \log \sum_i \frac{g_i}{g_0}\, e^{-(\epsilon_i - \epsilon_0)/kT} + nk \log \frac{e\,(2\pi m)^{3/2} k^{5/2} g_0}{h^3}, \qquad (136.18)$$

where the next to the last term would now have to be evaluated on
the basis of specific information as to the electronic levels $\epsilon_i$. It will
be noted that this term would go to zero at temperatures low enough
so that the energy of the first excited level $(\epsilon_1 - \epsilon_0)$ would be large
compared with kT. Hence the term can often be neglected. For values
of this term in the case of the monatomic halogens where it proves
important, and for values of the multiplicity of the ground-level in
the case of a number of monatomic gases, reference may be made to
Fowler's Statistical Mechanics.†

Let us now turn to a consideration of the energy and entropy of
diatomic gases, composed of molecules capable of existing in various
states of internal rotation, vibration, and electronic excitation. Precise
values of the energy and entropy of such a gas at any desired temperature
could of course be computed in any specific case with the help
of the requisite precise spectroscopic data. It will be of advantage,
however, also to consider a somewhat general approximate method of
treating the energy and entropy of such gases in the temperature range
where their rotational levels can be regarded as fully excited.

To obtain such a treatment we may first recall the results which were
derived in § 74 (d) for a simple model of a diatomic molecule consisting
of two dissimilar particles, without spin, which are regarded as held
together by an attractive force obeying Hooke's law. For the energy
levels of such a model we obtained the approximate expression

$$\epsilon_{Jv} = \frac{J(J+1)h^2}{8\pi^2 I} + (v + \tfrac{1}{2})\,h\nu, \qquad (136.19)$$


where the rotational and vibrational quantum numbers J and v can
assume the values

$$J = 0, 1, 2, 3, \ldots, \qquad v = 0, 1, 2, 3, \ldots, \qquad (136.20)$$

and $I$ and $\nu$ are the moment of inertia and the classical vibration frequency
of the molecule when in its equilibrium configuration. Furthermore,
for the multiplicity or quantum weight of the various levels we
obtained the result

$$g_J = 2J + 1. \qquad (136.21)$$

Actual diatomic molecules are of course more complex than the above
simple model. They consist not only of a pair of nuclei, but of surrounding
electrons as well, and the nuclear particles are not necessarily
dissimilar and without spin. Hence there are many complicating factors
which must be considered in the treatment of actual molecules. Nevertheless,
in agreement with the foregoing, it is often possible to represent
the internal energy of a diatomic molecule with sufficient approximation
by the formula

$$\epsilon_i = \epsilon_0 + \frac{J(J+1)h^2}{8\pi^2 I} + \epsilon_{ve}, \qquad (136.22)$$

where the first term $\epsilon_0$ gives the internal energy which we assign to the
molecule in its ground-level (including therein the corresponding half-quantum
of vibrational energy), the second term gives the rotational
energy corresponding to the quantum number J, and the last term
represents the energy associated with the state of vibrational and electronic
excitation denoted by the subscript ve. Both of these latter
terms will go to zero for the ground-level denoted by J = v = e = 0.
Furthermore, it will often be possible to represent the quantum weight
of any level Jve by the formula

$$g_{Jve} = (2J + 1)\, g_{ve}, \qquad (136.23)$$

where (2J+1) is the multiplicity of the rotational level J, and $g_{ve}$ is
the multiplicity of the vibrational and electronic state ve. It is to be
noted that $g_{ve}$ will not necessarily go to unity in the ground-level, since
this may be degenerate on account of nuclear spins or unresolved
electronic states.

† Fowler, Statistical Mechanics, second edition, Cambridge, 1936, pp. 218-20. For
the connexion between Fowler's notation and ours, see footnote † on p. 608.

Assuming the satisfactory character of the above expressions for the
energy levels and quantum weights of our diatomic molecule, we
can then write for the logarithm of a needed summation over the
internal states i of the molecule

$$\log \sum_i g_i\, e^{-(\epsilon_i - \epsilon_0)/kT} = \log \sum_{J,v,e} (2J+1)\, g_{ve}\, e^{-J(J+1)h^2/8\pi^2 IkT}\, e^{-\epsilon_{ve}/kT}$$
$$= \log \sum_J (2J+1)\, e^{-J(J+1)h^2/8\pi^2 IkT} + \log \sum_{v,e} g_{ve}\, e^{-\epsilon_{ve}/kT}$$
$$= \log \sum_J (2J+1)\, e^{-J(J+1)h^2/8\pi^2 IkT} + \log \sum_{v,e} \frac{g_{ve}}{g_0}\, e^{-\epsilon_{ve}/kT} + \log g_0, \qquad (136.24)$$

where in the last form of writing we have factored out the quantum
weight of the ground-level in order to separate the constant term
$\log g_0$ from the remainder of the expression which goes to zero at low
temperatures. Furthermore, at sufficiently high temperatures so that
we can regard the rotational levels as fully excited, that is, at temperatures
such that

$$kT \gg \frac{h^2}{8\pi^2 I}, \qquad (136.25)$$

we can replace summation by integration by putting


$$\sum_J (2J+1)\, e^{-J(J+1)h^2/8\pi^2 IkT} = \int (2J+1)\, e^{-J(J+1)h^2/8\pi^2 IkT}\, dJ. \qquad (136.26)$$

In the case of a diatomic molecule, where the two nuclei are dissimilar,
all the levels J = 0, 1, 2, ... could be occupied and the above would
give us

$$\sum_J (2J+1)\, e^{-J(J+1)h^2/8\pi^2 IkT} = \frac{8\pi^2 IkT}{h^2} \qquad (136.27)$$

as a reasonable approximation.† In the case of a diatomic molecule
where the two nuclei are similar, only alternate levels could be occupied,
and we can then take

$$\sum_J (2J+1)\, e^{-J(J+1)h^2/8\pi^2 IkT} = \frac{1}{2}\,\frac{8\pi^2 IkT}{h^2} \qquad (136.28)$$

as a reasonable approximation. Hence, returning to (136.24), we can
now write in general, for the internal states of our molecule at sufficiently
high temperatures,

$$\log \sum_i g_i\, e^{-(\epsilon_i - \epsilon_0)/kT} = \log \frac{8\pi^2 IkT}{h^2} - \log \sigma + \log \sum_{v,e} \frac{g_{ve}}{g_0}\, e^{-\epsilon_{ve}/kT} + \log g_0, \qquad (136.29)$$

where the symmetry number $\sigma$ assumes the value 1 or 2 according as
the two nuclei are dissimilar or alike.
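The replacement of the rotational sum by an integral can be tested directly. In the sketch below the single parameter is the ratio $h^2/8\pi^2 IkT$, written `x`; the value chosen for the high-temperature case (2.9/300) is an illustrative assumption and is not data from the text. The sum over all J is compared with 1/x, and the sum over alternate levels (even J only) with 1/2x, corresponding to symmetry numbers 1 and 2.

```python
import math

def rotational_sum(x, step=1, j_max=2000):
    """Sum of (2J+1)*exp(-J(J+1)*x) over J = 0, step, 2*step, ... below j_max.

    x stands for h**2 / (8 pi**2 I k T); step=2 keeps alternate levels only.
    """
    return sum((2 * J + 1) * math.exp(-J * (J + 1) * x)
               for J in range(0, j_max, step))

# Assumed illustrative value: rotational levels well excited.
x_hot = 2.9 / 300.0
all_levels = rotational_sum(x_hot)
alternate_levels = rotational_sum(x_hot, step=2)
classical = 1.0 / x_hot
print(all_levels / classical, alternate_levels / (0.5 * classical))

# Rotational levels hardly excited: the integral approximation fails.
x_cold = 5.0
print(rotational_sum(x_cold), 1.0 / x_cold)
```

In the first case both ratios differ from unity by less than one per cent; in the second the sum stays close to 1 (the ground-level term alone) while the integral would give 0.2, which shows why the condition of fully excited rotation is needed.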

Returning to our general expression (136.11), and making use of
the result of differentiating (136.29) with respect to T, we can now
write for the energy of a diatomic gas, at temperatures T high enough
so that the rotational levels can be regarded as fully excited,

$$E = n\epsilon_0 + \tfrac{5}{2}\,nkT + n\,\frac{\sum_{v,e} g_{ve}\, \epsilon_{ve}\, e^{-\epsilon_{ve}/kT}}{\sum_{v,e} g_{ve}\, e^{-\epsilon_{ve}/kT}}. \qquad (136.30)$$

The first term $n\epsilon_0$ gives the energy that we ascribe to the n molecules
in the ground-level. The second term gives the sum of their translational
energy $\tfrac{3}{2}nkT$, and the additional rotational energy nkT. The
last term gives the further energy to be ascribed to the excitation of
the vibrational and electronic levels ve. This term would tend towards
zero at sufficiently low temperatures, and for higher temperatures could
be computed from a knowledge of spectral data.
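To show how the last term behaves, the sketch below evaluates the ratio of sums for an assumed ladder of purely vibrational levels $\epsilon_{ve} = v\,h\nu$ with unit weights (the Hooke's-law model, electronic excitation ignored, energies reckoned from the ground-level). For this special case the ratio is known in closed form, $h\nu/(e^{h\nu/kT} - 1)$, which provides a check on the summation. The level spacing, the truncation at 400 levels, and the function names are assumptions of the example.

```python
import math

def excitation_energy(levels, weights, kT):
    """sum g*eps*exp(-eps/kT) / sum g*exp(-eps/kT) for levels reckoned from the ground-level."""
    num = sum(g * e * math.exp(-e / kT) for e, g in zip(levels, weights))
    den = sum(g * math.exp(-e / kT) for e, g in zip(levels, weights))
    return num / den

def harmonic_closed_form(h_nu, kT):
    """h*nu / (exp(h*nu/kT) - 1), valid for an unbounded ladder of equally spaced levels."""
    return h_nu / math.expm1(h_nu / kT)

h_nu = 1.0                                  # vibrational quantum, arbitrary units
levels = [v * h_nu for v in range(400)]     # truncated ladder
weights = [1] * len(levels)

for kT in (0.1, 1.0, 5.0):
    summed = excitation_energy(levels, weights, kT)
    print(kT, summed, harmonic_closed_form(h_nu, kT))
```

At kT = 0.1 the excitation energy per molecule is below one ten-thousandth of a quantum, in line with the remark that the term tends towards zero at low temperatures; for actual molecules the levels and weights would come from spectral data rather than from this model.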

† The above approximation could be slightly improved by integrating from −½ to ∞.
For a treatment of rotational levels based on a more precise formula than (136.27), see
Giauque and Overstreet, Journ. Amer. Chem. Soc. 54, 1731 (1932).

Turning now to our general expression for entropy (136.16), substituting
(136.29), and segregating variable and constant terms, we
obtain for the entropy of a diatomic gas, at sufficiently high temperatures,

$$S = \frac{E - n\epsilon_0}{T} + \tfrac{7}{2}\,nk \log T - nk \log p + nk \log \sum_{v,e} \frac{g_{ve}}{g_0}\, e^{-\epsilon_{ve}/kT} + nk \log \frac{e\,(2\pi m)^{3/2} k^{7/2}\, 8\pi^2 I\, g_0}{h^5\, \sigma}, \qquad (136.31)$$

where the next to the last term, which is seen to go to zero at low
temperatures, would still have to be evaluated from spectral knowledge
as to vibrational and electronic excitation levels. The factors in the
final constant term, which refer to the specific diatomic molecule under
consideration, are the mass m, moment of inertia I, degeneracy of the
ground-level $g_0$ and symmetry number $\sigma$. For the values of $g_0$ and $\sigma$
in the case of a number of diatomic gases, and for the effect of the
next to the last term in (136.31) in the case of NO where it proves
important, reference may be made to Fowler's Statistical Mechanics.†

137. Crystals composed of a single substance 

(a) The modes of vibration of a crystal. We may now turn our con- 
siderations from the gaseous state characterized by the tendency 
towards a random distribution of component particles, to the crystalline 
state characterized by the tendency towards a highly ordered arrange- 
ment of component particles. We must begin by concentrating our 
attention on the positions and motions of the particles composing the 
crystal, leaving the possibilities for actual particles to assume different 
internal states for later inclusion in the theory. 

Let us consider, at first from a classical point of view, a crystal
composed of n similar particles of mass m capable of performing small
oscillations about a set of equilibrium positions which are determined
by the nature of the forces acting between the particles. These equilibrium
positions inside the crystal structure may be specified as to
their permanent relative spatial arrangement by the set of numbers
1, 2, 3, ..., n.

To treat the condition of the crystal we may now introduce, corresponding
to each particle, a set of three Cartesian coordinates

$$(x_1 y_1 z_1),\ (x_2 y_2 z_2),\ \ldots,\ (x_n y_n z_n) \qquad (137.1)$$

to denote the position of the oscillating particle associated with each
equilibrium position. It may be noted for later reference that the
indices 1, 2, 3, ..., n thus assigned are determined by the permanent
relative positions of the equilibrium positions. Hence, in accordance
with the discussions of § 76, when we come to the application of
quantum mechanics no solutions will have to be eliminated on the basis
of symmetry restrictions.

† Fowler, loc. cit., pp. 220-6. For the connexion between Fowler's notation and
ours, see footnote † on p. 608.

Using the above coordinates, the energy of the crystal could then be
expressed in the form

$$E = \tfrac{1}{2}\,m(\dot{x}_1^2 + \dot{y}_1^2 + \cdots + \dot{z}_n^2) + V(x_1, y_1, \ldots, z_n) + \text{const.}, \qquad (137.2)$$

where the first term gives the sum of the kinetic energies of the particles,
the second term gives their mutual potential energy, and the last term
allows for any desired later choice of the energy zero-point. The explicit
expression for the potential energy V appearing in this equation would,
however, depend in a very complicated way on the coordinates $x_1, \ldots, z_n$,
and a change to new coordinates will be convenient. Assuming
the oscillations of the particles small enough so that the potential energy
of the particles could be regarded as a quadratic function of their displacements
from the equilibrium positions, a transformation to so-called
normal coordinates $q_1, q_2, \ldots, q_{3n}$ can be introduced, following
well-known methods,† which will make it possible to rewrite the
energy in the form

$$E = \tfrac{1}{2}(m_1 \dot{q}_1^2 + m_2 \dot{q}_2^2 + \cdots + m_{3n} \dot{q}_{3n}^2) + \tfrac{1}{2}(b_1 q_1^2 + b_2 q_2^2 + \cdots + b_{3n} q_{3n}^2) + E_0, \qquad (137.3)$$

where the m's and b's are constants, and $E_0$ is the zero-point energy
that we assign to the crystal in the absence of oscillational excitation.
By introducing the momenta $p_1, \ldots, p_{3n}$ which correspond to our
present coordinates, this can also be written in the desired Hamiltonian
form

$$H = \tfrac{1}{2}(a_1 p_1^2 + a_2 p_2^2 + \cdots + a_{3n} p_{3n}^2) + \tfrac{1}{2}(b_1 q_1^2 + b_2 q_2^2 + \cdots + b_{3n} q_{3n}^2) + E_0. \qquad (137.4)$$

The a's and b's appearing in this final expression for the classical
Hamiltonian are constants. The coordinates $q_1, \ldots, q_{3n}$ are linear functions
of the original coordinates $x_1, \ldots, z_n$. Six of these coordinates can
be taken as determining the position and orientation of the crystal as
a whole, so that the corresponding coefficients a will be the reciprocals
of the mass and moments of inertia of the crystal as a whole and the
correlated coefficients b will be zero. The remaining coordinates and
momenta can evidently be regarded as corresponding to $3n - 6$ modes
of harmonic vibration, and the internal motions of the crystal can then
be regarded as a superposition of the vibrations in these modes.

† See Whittaker, Analytical Dynamics, second edition, Cambridge, 1917, p. 177.

The expression for the Hamiltonian given by (137.4) is of course not 
quite precise, since it contains the usual approximations involved in 
the method which we are developing. In particular we may mention 
that it neglects small terms in the potential energy which would actually 
be present if we allowed for the interaction between oscillators corre- 
sponding to the possibility of energy transfer from one mode to another. 
Furthermore, when we wish to regard the crystal as in contact with its 
vapour it is not strictly accurate to treat the particles on the surface 
as controlled by pure elastic forces. Nevertheless, the treatment made 
possible by (137.4) will be accurate enough for our purposes, and since 
Zn will actually be a very large number compared with 6, it will also 
be sufficiently precise to regard Zn as the number of modes of harmonic 
vibration by which we represent the internal motions of the crystal. 

The frequencies v of the different modes of vibration will be deter- 
mined from a classical point of view by the values of the corresponding 
constants a and b in accordance with the familiar relation 




271 


(137.5) 


The modes of vibration of the lowest frequencies may be regarded as 
those that would ordinarily be spoken of as mechanical or acoustical, 
and those of higher frequencies as corresponding to motions of the 
particles that would ordinarily be spoken of as thermal. There will be 
an upper limit for the maximum possible frequency since there are 
only 3n modes of vibration in all.

Since the frequencies of the different modes of vibration in the case 
of a large crystal will lie close together over the major portion of the 
spectrum, we can introduce an expression of the general form 

$$dz = f(\nu)\,d\nu \tag{137.6}$$

for the number of modes of vibration in the frequency range ν to ν+dν.
To determine the form of the function f(v), it is sufficiently accurate 
for many purposes, following Debye,† to neglect the particle structure
and treat the crystal as a continuous, elastic solid. For the number of
modes of vibration in the range v to v+dv, we should then have an 
expression of the known form 

$$dz = \frac{4\pi v}{C^3}\,\nu^2\,d\nu, \tag{137.7}$$


† Debye, Ann. Physik, 39, 789 (1912). This important article has already been
mentioned as significant for the development of quantum theory.



where v is the volume of the solid and C is a quantity which is deter- 
mined by the elastic properties of the solid which can be taken as 
approximately constant. For example, for an isotropic solid 1/C³ would
be given by

$$\frac{1}{C^3} = \frac{1}{c_l^3} + \frac{2}{c_t^3}, \tag{137.8}$$

where c_l and c_t are the velocities of the longitudinal and transverse
waves of which the solid is capable. An expression of the form (137.7) 
could be expected to give a good representation of the spectral distribu- 
tion for frequencies low enough so that the corresponding wave-lengths 
are large compared with the distances between neighbouring particles. 
The expression must in any case be modified at sufficiently high frequencies
since there are only a finite number of frequencies 3n in all.
Taking the expression as holding sufficiently well up to the maximum
frequency ν_m we can write

$$3n = \int_0^{\nu_m}\frac{4\pi v}{C^3}\,\nu^2\,d\nu = \frac{4\pi v}{3C^3}\,\nu_m^3, \tag{137.9}$$

which gives for the maximum frequency the value

$$\nu_m = C\left(\frac{9n}{4\pi v}\right)^{1/3}. \tag{137.10}$$
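
As a numerical illustration of this counting argument, the following Python sketch takes the Debye density of modes for an isotropic solid in its standard form, 4πv(1/c_l³ + 2/c_t³)ν² dν, and finds the frequency at which the number of modes reaches 3n. The function names and the sample values of n, v, c_l and c_t are illustrative assumptions and are not taken from the text.

```python
import math

def inverse_c_cubed(c_l, c_t):
    """1/C^3 for an isotropic solid: one longitudinal and two transverse branches."""
    return 1.0 / c_l**3 + 2.0 / c_t**3

def mode_count(nu, v, c_l, c_t):
    """Number of modes below frequency nu: integral of 4*pi*v*nu^2/C^3 from 0 to nu."""
    return 4.0 * math.pi * v * inverse_c_cubed(c_l, c_t) * nu**3 / 3.0

def max_frequency(n, v, c_l, c_t):
    """Frequency nu_m at which the mode count reaches 3n."""
    return (9.0 * n / (4.0 * math.pi * v * inverse_c_cubed(c_l, c_t))) ** (1.0 / 3.0)

if __name__ == "__main__":
    # Assumed, purely illustrative values in SI units.
    n, v = 8.5e28, 1.0           # number of particles in one cubic metre
    c_l, c_t = 4700.0, 2300.0    # longitudinal and transverse wave velocities, m/s
    nu_m = max_frequency(n, v, c_l, c_t)
    print(f"nu_m = {nu_m:.3e} Hz")
    print(f"modes below nu_m divided by 3n = {mode_count(nu_m, v, c_l, c_t) / (3 * n):.6f}")
```

The second printed ratio is a consistency check: by construction the modes counted up to ν_m should number 3n.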


(b) Application of quantum mechanics. We are now ready to turn
to the application of quantum mechanics to our system by considering
the allowed eigensolutions of the Schroedinger equation

$$H\,u(q_1 \ldots q_{3n}) = E\,u(q_1 \ldots q_{3n}), \tag{137.11}$$

where H is the Hamiltonian operator for the crystal and E an allowed
eigenvalue of its energy. Noting the form of the Hamiltonian operator
which would be obtained from (137.4) by the substitution of
(h/2πi) ∂/∂q_i for p_i,
we see that this equation can be handled by the method of separation
of variables by assuming a solution of the form

$$u(q_1 \ldots q_{3n}) = u_1(q_1)\,u_2(q_2) \ldots u_{3n}(q_{3n}) = \prod_i u_i(q_i). \tag{137.12}$$


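The complete equation (137.11) can then be solved with the help of
3n one-dimensional equations each of the form

$$-\frac{h^2}{8\pi^2}\,a_i\,\frac{d^2 u_i}{dq_i^2} + \tfrac{1}{2}\,b_i q_i^2\,u_i = E_i u_i, \tag{137.13}$$

where the quantities E_i are the energy eigenvalues for the individual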
modes of vibration. Since each of the individual equations (137.13) is 
of the form already studied for a simple harmonic oscillator in § 72, 





their eigensolutions will be of the form previously obtained, and
the allowed individual eigenvalues of the vibrational energy will be
given by expressions of the form

$$E_i = (k_i + \tfrac{1}{2})\,h\nu_i, \qquad k_i = 0, 1, 2, 3, \ldots, \tag{137.14}$$

where ν_i is the frequency of the ith mode of vibration. The allowed
eigenvalues for the energy of the system as a whole may then be
expressed in the form

$$E = E_0 + \sum_i E_i, \tag{137.15}$$

where E_0 is the zero-point energy assigned to the crystal and Σ_i E_i is
a sum of possible eigenvalues of energy for the 3n modes of vibration.

In connexion with the foregoing it is to be noted that none of the 
eigensolutions for the different modes of vibration are to be eliminated 
on the basis of symmetry considerations, since the modes of vibration 
are permanently distinguishable the one from another by their spatial 
location and orientation inside the crystal. This is in agreement with 
the fact that the coordinates q_1 … q_{3n} for these modes of vibration are
functions of the original coordinates x_1 y_1 z_1 … x_n y_n z_n which were assigned
to the different particles on the basis of their association with
the permanently distinguishable equilibrium positions in the crystal. 

(c) Sum-over-states for the crystal. We have now obtained all that 
is needed for a calculation of the sum-over-states for the crystal, 

$$Z = \sum_m e^{-E_m/kT}, \tag{137.16}$$


where the summation is over all possible energy eigenstates m, each 
individual state to be separately included in the case of degenerate 
energy levels. Noting the expression for the total energy of the crystal 
given by (137.15), and allowing for the possibility that the lowest 
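energy level may be degenerate and actually consist of G_0 individual
states, we can then rewrite (137.16) in the form

$$Z = G_0\,e^{-E_0/kT}\sum e^{-(E_1+E_2+\cdots+E_{3n})/kT}, \tag{137.17}$$

where the summation Σ is to be taken over all combinations of the
possible eigenvalues E_1, E_2, E_3,..., E_{3n} of the 3n modes of oscillation.
Introducing the expression for the possible eigenvalues for these energies
given by (137.14), we obtain

$$\begin{aligned}
Z &= G_0\,e^{-E_0/kT}\sum_{k_1=0}^{\infty} e^{-(k_1+\frac{1}{2})h\nu_1/kT}\sum_{k_2=0}^{\infty} e^{-(k_2+\frac{1}{2})h\nu_2/kT}\cdots\sum_{k_{3n}=0}^{\infty} e^{-(k_{3n}+\frac{1}{2})h\nu_{3n}/kT} \\
  &= G_0\,e^{-E_0/kT}\prod_{i=1}^{3n}\frac{e^{-h\nu_i/2kT}}{1-e^{-h\nu_i/kT}},
\end{aligned} \tag{137.18}$$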
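
The step from the sum over the quantum numbers of a single mode to its closed form is the summation of a geometric series. The following Python sketch checks this numerically for one oscillator, writing x for hν/kT; the function names and the number of terms retained are illustrative choices.

```python
import math

def oscillator_sum_direct(x, terms=5000):
    """Sum of exp(-(k + 1/2) x) over k = 0 .. terms-1, with x = h*nu/(k_B*T)."""
    return sum(math.exp(-(k + 0.5) * x) for k in range(terms))

def oscillator_sum_closed(x):
    """Closed form exp(-x/2) / (1 - exp(-x)) of the same geometric series."""
    return math.exp(-0.5 * x) / (1.0 - math.exp(-x))

if __name__ == "__main__":
    for x in (0.1, 1.0, 5.0):
        print(x, oscillator_sum_direct(x), oscillator_sum_closed(x))
```

For the values of x shown, the truncated sum and the closed form should agree to within rounding error, since the neglected tail is exponentially small.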





where ν_1, ν_2,..., ν_{3n} denote the frequencies of the 3n modes of vibration,
and the second form of writing can be verified by performing the
indicated divisions.

For the purposes of calculating thermodynamic quantities we shall 
be interested in the logarithm of the above quantity, which can be 
written in the form 

$$\log Z = -\frac{E_0}{kT} - \sum_{i=1}^{3n}\log\bigl(1-e^{-h\nu_i/kT}\bigr) - \sum_{i=1}^{3n}\frac{h\nu_i}{2kT} + \log G_0, \tag{137.19}$$

where the summations are over all 3n modes of vibration i. Substituting
the expression (137.7) for the number of modes of vibration in the range
ν to ν+dν, we can then replace these summations by integrations from
ν = 0 to the maximum frequency ν = ν_m, and write

$$\log Z = -\frac{E_0}{kT} - \frac{4\pi v}{C^3}\int_0^{\nu_m}\nu^2\log\bigl(1-e^{-h\nu/kT}\bigr)\,d\nu - \frac{9}{8}\,n\,\frac{h\nu_m}{kT} + \log G_0, \tag{137.20}$$

where the next to the last term is obtained by evaluating the integral
which depends on the residual 'half-quanta' ½hν which our modes of
vibration possess in their lowest quantum states.

Finally, it will be profitable to re-express this result by introducing 
a dimensionless variable x, which is defined by 

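$$x = \frac{h\nu}{kT}, \tag{137.21}$$

together with a quantity Θ, determined by the elastic properties of the
crystal, which is defined by

$$\Theta = \frac{h\nu_m}{k} = \frac{hC}{k}\left(\frac{9n}{4\pi v}\right)^{1/3}, \tag{137.22}$$

where the second form of writing results from (137.10). This quantity
may be called the characteristic temperature for the crystal. Introducing
x and Θ into (137.20), we now obtain the desired result

$$\log Z = -\frac{E_0}{kT} - 9n\,\frac{T^3}{\Theta^3}\int_0^{\Theta/T}x^2\log(1-e^{-x})\,dx - \frac{9}{8}\,n\,\frac{\Theta}{T} + \log G_0 \tag{137.23}$$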

as a general expression for the sum-over-states Z of such a crystal. 

We shall also be specially interested in the form of this expression 
at very low and at very high temperatures where the evaluation of the 





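integral which it contains is simple. At very low temperatures, T ≪ Θ,
we can evidently take

$$\int_0^{\Theta/T}x^2\log(1-e^{-x})\,dx = \int_0^{\infty}x^2\log(1-e^{-x})\,dx = -\frac{1}{3}\int_0^{\infty}\frac{x^3}{e^x-1}\,dx = -\frac{\pi^4}{45}, \tag{137.24}$$

where the second form of writing is obtained with the help of a partial
integration. This then gives us

$$\log Z = -\frac{E_0}{kT} + \frac{\pi^4}{5}\,n\,\frac{T^3}{\Theta^3} - \frac{9}{8}\,n\,\frac{\Theta}{T} + \log G_0. \tag{137.25}$$
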
At very high temperatures, T ≫ Θ, only very small values of x, between
0 and Θ/T, will be involved in the integrand appearing in (137.23), and
we can substitute

$$\log(1-e^{-x}) \approx \log x. \tag{137.26}$$

Doing so, we have an integral of the simple form ∫x² log x dx, and
obtain the result

$$\log Z = -\frac{E_0}{kT} + 3n\log\frac{T}{\Theta} + n - \frac{9}{8}\,n\,\frac{\Theta}{T} + \log G_0. \tag{137.27}$$
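
The two limiting forms can be compared with a direct numerical evaluation of the integral. The sketch below is a minimal check and not part of the original treatment: it evaluates only the excitation term −9(T/Θ)³∫x² log(1−e^{−x})dx per particle, as a function of the reduced temperature t = T/Θ, and compares it with (π⁴/5)t³ at low temperature and with 3 log t + 1 at high temperature. The Simpson integrator, the step count and all function names are assumptions made for illustration.

```python
import math

def integrand(x):
    """x^2 * log(1 - exp(-x)); the product tends to 0 as x -> 0."""
    if x == 0.0:
        return 0.0
    return x * x * math.log(-math.expm1(-x))

def simpson(f, a, b, steps=20000):
    """Composite Simpson rule on [a, b] with an even number of steps."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def excitation_term(t):
    """-9 t^3 * integral_0^{1/t} x^2 log(1 - e^-x) dx, with t = T/Theta."""
    return -9.0 * t**3 * simpson(integrand, 0.0, 1.0 / t)

def low_t_form(t):
    """Low-temperature form (pi^4 / 5) t^3."""
    return math.pi**4 / 5.0 * t**3

def high_t_form(t):
    """High-temperature form 3 log t + 1, obtained with log(1 - e^-x) ~ log x."""
    return 3.0 * math.log(t) + 1.0

if __name__ == "__main__":
    for t in (0.02, 0.05, 10.0, 50.0):
        print(t, excitation_term(t), low_t_form(t), high_t_form(t))
```

The agreement at high temperatures is only to leading order, since replacing log(1−e^{−x}) by log x drops the higher powers of x.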


(d) Thermodynamic properties of the crystal. With the help of the
foregoing expressions for log Z, we can now investigate the thermodynamic
properties of the crystal by making use of our general equations
(133.3) to (133.6) for pressure, energy, heat capacity, and entropy.

For the pressure we may write at once from (137.23), for any temperature,
the general expression

$$p = kT\,\frac{\partial\log Z}{\partial v} = -\frac{\partial E_0}{\partial v} + \cdots, \tag{137.28}$$

where, without attempting any more complete analysis, stress may be
laid on the specially important term −∂E_0/∂v.

In the case of the remaining quantities we may begin by considering 
the low-temperature and high-temperature results separately. 

At very low temperatures we must make use of (137.25). Treating Θ,





which depends on the elastic properties of the crystal, as independent
of the temperature, we then obtain for the energy of the crystal

$$E = kT^2\,\frac{\partial\log Z}{\partial T} = E_0 + \frac{3\pi^4}{5}\,\frac{nkT^4}{\Theta^3} + \frac{9}{8}\,nk\Theta, \tag{137.29}$$

where the first term is the zero-point energy, the second the energy of
excitation of the modes of oscillation, and the last the residual energy
in these modes due to their 'half-quanta' of energy when unexcited.
For the heat capacity at constant volume we obtain, by differentiation,

$$C_v = \frac{12\pi^4}{5}\,\frac{nkT^3}{\Theta^3}, \tag{137.30}$$

thus giving the third power dependence on temperature first found by
Debye. Finally, for the entropy we obtain

$$S = \frac{E}{T} + k\log Z = \frac{4\pi^4}{5}\,\frac{nkT^3}{\Theta^3} + k\log G_0, \tag{137.31}$$

where we shall give later consideration to the important term k log G_0
which depends on the degeneracy of the ground-level. We may regard
these results with great confidence, since only the modes of vibration 
of low frequency and hence of wave-length long compared with the 
distances between neighbouring particles will be appreciably excited at 
low temperatures, and these modes may well be taken as distributed 
in accordance with the assumed expression (137.7). 

At very high temperatures we must make use of the expression for
log Z given by (137.27). Treating Θ again as independent of T, we
obtain for the energy of the crystal

$$E = kT^2\,\frac{\partial\log Z}{\partial T} = E_0 + 3nkT + \frac{9}{8}\,nk\Theta. \tag{137.32}$$

By differentiation we then obtain for the heat capacity at constant
volume

$$C_v = 3nk, \tag{137.33}$$

in agreement with the original high-temperature expression of Dulong
and Petit. Finally, for the entropy we obtain

$$S = \frac{E}{T} + k\log Z = 4nk + 3nk\log\frac{T}{\Theta} + k\log G_0, \tag{137.34}$$

where the term k log G_0 again appears. In so far as these high-temperature
results depend merely on the total number of modes of oscillation,
and not on the assumed distribution of frequencies which has led to
terms containing Θ, we may regard them as quite precise.




At intermediate temperatures, between the very low and high ones
treated above, we must make use of the full expression for log Z given
by (137.23). Without presenting the details of the calculation, we then
find as the expression for energy

$$E = E_0 + 9nkT\,\frac{T^3}{\Theta^3}\int_0^{\Theta/T}\frac{x^3}{e^x-1}\,dx + \frac{9}{8}\,nk\Theta \tag{137.35}$$

and as the expression for entropy

$$S = 12nk\,\frac{T^3}{\Theta^3}\int_0^{\Theta/T}\frac{x^3}{e^x-1}\,dx - 3nk\log\bigl(1-e^{-\Theta/T}\bigr) + k\log G_0, \tag{137.36}$$

where use can now be made of existing tables† for the unevaluated
integral, the so-called Debye function, which still remains. The dependence
of energy on temperature given by (137.35) is actually found
to follow the observations within the limit of error in the case of many
crystals, provided Θ is chosen so as to secure the best fit, and some
measure of agreement has also been found between such values of Θ
and those calculated from the appropriate elastic constants.
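
In place of tables, the Debye integral is easily evaluated numerically. The sketch below computes, per particle and as functions of t = T/Θ, the thermal part of the energy, 9t⁴∫x³/(eˣ−1)dx in units of kΘ, and the heat capacity in units of k. For the latter it uses the standard integral 9t³∫x⁴eˣ/(eˣ−1)²dx obtained by differentiating the energy with respect to T; that formula is not written out in the text and is stated here as an assumption of the sketch, as are the function names and the Simpson integrator.

```python
import math

def simpson(f, a, b, steps=20000):
    """Composite Simpson rule on [a, b] with an even number of steps."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def energy_integrand(x):
    """x^3 / (e^x - 1), written with e^-x to avoid overflow; 0 at x = 0."""
    return 0.0 if x == 0.0 else x**3 * math.exp(-x) / (-math.expm1(-x))

def heat_integrand(x):
    """x^4 e^x / (e^x - 1)^2, written with e^-x to avoid overflow; 0 at x = 0."""
    return 0.0 if x == 0.0 else x**4 * math.exp(-x) / math.expm1(-x) ** 2

def thermal_energy(t):
    """Excitation energy per particle in units of k*Theta, at t = T/Theta."""
    return 9.0 * t**4 * simpson(energy_integrand, 0.0, 1.0 / t)

def heat_capacity(t):
    """Heat capacity at constant volume per particle in units of k."""
    return 9.0 * t**3 * simpson(heat_integrand, 0.0, 1.0 / t)

if __name__ == "__main__":
    for t in (0.02, 0.1, 0.5, 1.0, 5.0, 50.0):
        print(t, thermal_energy(t), heat_capacity(t))
```

At small t the heat capacity should follow the third-power law (12π⁴/5)t³, and at large t it should approach the Dulong and Petit value 3.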

It should of course be remarked, in connexion with the above results 
for crystals at high and intermediate temperatures, that a complete 
treatment would also have to give recognition to the possibility of 
exciting internal states of the component molecules above the ground- 
level when the temperature is sufficiently raised. 

(e) Remarks on the entropy of crystals. The foregoing investigation
is of special importance because of its bearing on the values which
should be ascribed to the entropies of actual crystals as we approach
the absolute zero of temperature. At low temperatures we have obtained
the expression

$$S = \frac{4\pi^4}{5}\,\frac{nkT^3}{\Theta^3} + k\log G_0 \tag{137.37}$$

for the entropy of our simple model of a crystal. As the temperature T
goes to lower and lower values, this approaches the value

$$S_{T=0} = k\log G_0, \tag{137.38}$$

where the quantity G_0 is the quantum weight or number of independent
quantum states which actually correspond to the state of the crystal
when its modes of oscillation have not been excited.

In the case of a crystal uniquely constructed from a single kind of 

† See, for example, six-place tables by Beattie, Journ. Math. and Phys. (Mass.
Inst. Tech.) 6, 1 (1926).





molecule, which itself exhibits no degeneracy in the ground state, this 
quantum weight G_0 could be taken as unity since the crystal would
approach a single definite structure as we make the temperature lower
and lower. It would then be consistent to take the entropy of the
crystal as approaching at low temperatures the value

$$S_{T=0} = 0, \tag{137.39}$$

in agreement with the formulation originally given to the so-called
third law of thermodynamics by Planck.

In the case of many crystals, however, composed of a chemically 
pure substance, it is now quite certain that we cannot regard the 
crystal as approaching a condition corresponding to a single quantum 
mechanical state when we extrapolate from the lowest temperatures 
available in the laboratory. In such cases we shall have to take G_0
greater than unity, and hence also take

$$S_{T=0} > 0, \tag{137.40}$$

if we desire a starting-point for entropy, consistent with the assignment 
of zero entropy to any pure state, as is used when the same substance 
is considered in the gaseous or other non-crystalline form, or as entering 
into chemical reaction. The presence of multiplicity in the ground-level
approached by pure crystalline substances at low temperatures will be
due to one or more of the following factors: existence of isotopes,
presence of allotropic forms, randomness in the crystal structure, or
unresolved degeneracy in the component molecules. 

First of all we may consider the possibility that an elementary sub- 
stance which we regard as pure from the ordinary chemical point of 
view is in reality a mixture of isotopes. To give a strict treatment to
crystals of such substances, suitable for use when possible variations in
isotopic composition are to be considered, it will then be necessary to
take the quantum weight G_0 for the low-temperature state so as to
correspond to the different ways in which the isotopic atoms could be
arranged in the crystal lattice, since these different arrangements would
correspond to macroscopically indistinguishable crystals but to microscopically
distinct quantum states. The entropy of such a crystal at
low temperatures would then also have to be assigned a value greater
than zero, as can be evaluated by the methods to be developed in the
next section where we consider mixtures in general. Nevertheless, in
giving a treatment to mixtures of isotopes suitable merely for application
to ordinary physical-chemical processes where changes in isotopic
composition do not occur, it proves possible and convenient to introduce
small approximations which then make it feasible to specify the
entropies of crystals in a manner which ignores the existence of separate
isotopes.†

As our second case, we may consider the possibility that an elementary
substance can contain two allotropic forms of molecule, for
example ortho- and para-hydrogen, which in the absence of a catalyst 
cannot change readily from the one form into the other. Under such 
circumstances the crystal on which we actually make low-temperature 
measurements will ordinarily be a mixture of the two kinds of molecule 
in their high-temperature ratio, instead of being composed of the single 
form that would be thermodynamically stable at the absolute zero. It 
will then again be necessary to assign a weight G_0 greater than unity
and an entropy S_{T=0} greater than zero to the actual low-temperature
crystal, as can be calculated by the methods for treating mixtures which 
will be developed in the next section. 

As our third case, we now come to the possibility of randomness in
the crystal structure. This will arise when the atoms or molecules composing
the crystal can actually be arranged in a variety of different
ways without important disturbance in the main features of the crystal
lattice. A simple example is provided by carbon monoxide in which the
two atoms of the CO molecule are so similar that crystals apparently
form without much regularity of orientation with respect to the two
different ends of the molecule.‡ Assuming complete randomness in such
a respect for a crystal of n molecules, we should evidently have

$$S_{T=0} = nk\log 2 \quad\text{or}\quad 1.38\ \text{calories/(mol degree)} \tag{137.41}$$

as the contribution to low-temperature entropy. The actual measurements
of the entropy differences between crystalline and gaseous carbon
monoxide made by Clayton and Giauque gave a value 1.1 cal. per mol
per degree for the entropy which had to be assigned to the crystal on
account of such randomness, thus indicating some tendency towards
regularity of arrangement. Another good example is provided by ice,
where, in accordance with the considerations of Pauling,∥ there appear
to be several nearly equivalent arrangements for the hydrogen atoms
in the oxygen lattice. Here the calculated and observed entropy due
to the possibilities of random arrangement are respectively 0.805 and
0.82 calories per mol per degree, which is a very close agreement.
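
The two calculated figures quoted above can be reproduced directly. The following sketch assumes a rounded gas constant of 1.987 cal/(mol degree) and, for ice, Pauling's count of 3/2 accessible configurations per molecule, which is not stated explicitly in the text; the function name is illustrative.

```python
import math

R_CAL = 1.987  # gas constant in cal/(mol degree), rounded (assumed value)

def residual_entropy_per_mole(states_per_molecule):
    """Molar residual entropy R log g for g equally likely states per molecule."""
    return R_CAL * math.log(states_per_molecule)

if __name__ == "__main__":
    # Carbon monoxide: two orientations of each molecule in the lattice.
    print(f"CO : {residual_entropy_per_mole(2.0):.3f} cal/(mol degree)")
    # Ice: Pauling's estimate of 3/2 hydrogen arrangements per molecule (assumption of this sketch).
    print(f"ice: {residual_entropy_per_mole(1.5):.3f} cal/(mol degree)")
```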

As our final category, we now come to the possibility of unresolved

† See Giauque and Overstreet, Journ. Amer. Chem. Soc. 54, 1731 (1932).

‡ See Clayton and Giauque, ibid. 54, 2610 (1932).

∥ Pauling, ibid. 57, 2680 (1935); Giauque and Stout, ibid. 58, 1144 (1936).



degeneracy in the component molecules, which provide the similar particles
out of which the crystal lattice is really built. Here, if we have a crystal
composed of n molecules, which, in our actual low-temperature
measurements, are themselves in a level corresponding to g_0 separate
states, it is evident that we must take

$$S_{T=0} = k\log G_0 = nk\log g_0, \tag{137.42}$$

as giving the contribution to the low-temperature entropy of the crystal
arising from that cause. Such unresolved degeneracy will occur whenever
the component molecules have one or more eigenstates with energy
so close to that of the very lowest that they can all be regarded as
equally excited at the low temperatures actually achieved in the laboratory
measurements. The examples of such a situation are of several
kinds. In the case of crystals where the molecules can still rotate at
the lowest temperatures, we can have, as in the example of ortho-hydrogen,
a degeneracy of the lowest rotational state.† In the case of
crystals where the molecules presumably cannot rotate, we can have,
as in the example of Gd_2(SO_4)_3 · 8H_2O, degeneracy involving electronic
spin and orbital momentum.‡ Finally, in the case of crystals in general,
we can have possibilities for different orientations of nuclear spin when
this is present. In this latter case, however, it is often possible to ignore
the presence of nuclear spin since its effects may cancel when entropies involving
the same element under different circumstances are subtracted.∥

It is thus evident from the foregoing that we cannot neglect the
term k log G_0 in the low-temperature entropy of crystals, if we wish to
measure their entropies from the same starting-point, S = 0 for a pure
state, that we find simplest in general. Indeed, we must often expect

$$S_{T=0} > 0 \tag{137.43}$$

if we extrapolate down from actual low-temperature measurements on
the crystal. Furthermore, we must now expect to find cases where
chemical reactions between crystals would be accompanied by a change
in entropy as we approach T = 0,

$$\Delta S_{T=0} \neq 0, \tag{137.44}$$

which cannot be taken equal to zero as demanded by older formulations
of the so-called third law of thermodynamics. A list containing
a number of known cases, where the change in entropy for such low-temperature
crystalline reactions would not be equal to zero, will be
found in Fowler's Statistical Mechanics.†

† See Giauque and Johnston, Journ. Amer. Chem. Soc. 50, 3221 (1928); Pauling,
Phys. Rev. 36, 430 (1930).

‡ See Giauque, Journ. Amer. Chem. Soc. 49, 1870 (1927).

∥ See Gibson and Heitler, Zeits. f. Phys. 49, 466 (1928).

It is often convenient to write the expression for the residual low-temperature
entropy of a crystal in general in the form

$$S_0 = nk\log g_0, \tag{137.45}$$

where g_0 in simple cases will merely be the multiplicity of the ground-level
of the component molecules. Discussion of the values of the
quantity g_0 for a number of specific types of crystal will also be found
in Fowler's treatise.‡


138. Mixtures of substances 


So far we have been interested in the thermodynamic properties of 
gases and crystals composed of a single pure substance, and have merely 
referred to mixtures in connexion with the possibilities for isotopic or 
allotropic forms of the same chemical substance. We must now con- 
sider the thermodynamic properties of mixtures of different substances. 
We shall give separate treatments to the limiting cases of mixtures 
in a highly rarefied gas phase where the molecules have an almost 
negligible action on each other, and mixtures in a highly condensed 
crystalline or liquid phase where the interactions between molecules 
become of prime importance. 

(a) Gaseous mixtures. Let us consider a gaseous mixture composed 
of n molecules of one kind and m of another kind enclosed in a con- 
tainer of volume v. And let us take the dilution and temperature high 
enough, so that the effect of collisional interaction will not be sufficient 
to introduce appreciable change in the energy levels of the molecules, 
and so that the phenomenon of gas degeneration will not be appreciable. 
To treat the properties of such a mixture we must first obtain an 
expression for the corresponding sum-over-states, 

$$Z = \sum_i e^{-E_i/kT}, \tag{138.1}$$

where the summation is over all energy eigenstates i for the system 
as a whole. 


Following the same line of argument, developed in more detail in
§ 134 for systems containing only a single kind of molecule, we may
then write

$$E = \epsilon_j + \epsilon_k + \cdots + \epsilon'_r + \epsilon'_s + \cdots \tag{138.2}$$

as the expression for a possible energy of the system as a whole, where


† Fowler, loc. cit., p. 233.

‡ Fowler, loc. cit., pp. 218 ff. For the connexion between Fowler's notation and ours,
see footnote † on p. 608.





ε_j, ε_k,... are n possible eigenvalues of energy for a molecule of the first
kind and ε'_r, ε'_s,... are m possible eigenvalues for a molecule of the second
kind. Considering the circumstance that a new state of the system as
a whole would not be obtained by a mere permutation in which the
same states of energies ε or ε' were assigned different particle indices
1,..., n or 1,..., m, we may then write for the sum-over-states

$$Z = \sum_{j,k,\ldots}\;\sum_{r,s,\ldots}\frac{n_j!\,n_k!\cdots}{n!}\;\frac{m_r!\,m_s!\cdots}{m!}\;e^{-(\epsilon_j+\epsilon_k+\cdots+\epsilon'_r+\epsilon'_s+\cdots)/kT}, \tag{138.3}$$

where the symbols n_j, n_k,... and m_r, m_s,... denote the numbers of repetitions
of the same states of energies ε or ε' respectively. In the absence
of gas degeneration, where these numbers may be regarded as going to
unity for most states of the system that are important, we then obtain

$$Z = \frac{1}{n!}\Bigl(\sum_k e^{-\epsilon_k/kT}\Bigr)^{n}\,\frac{1}{m!}\Bigl(\sum_r e^{-\epsilon'_r/kT}\Bigr)^{m} \tag{138.4}$$


as the desired expression of the sum-over-states for a dilute, gaseous 
mixture of n molecules of one kind with m of another, where the summations
are over all individual eigenstates k and r that would be
exhibited by a single molecule of the two kinds respectively in a container
of the total volume v.

By comparing this expression for the sum-over-states of the mixture 
with the expression given by (136.1) on which we based the treatment 
of a single gas, we now obtain the important conclusion that the sum- 
over-states for our mixture of two gases would be equal to the product 
of the values of that quantity that would hold separately for each of 
the two component gases, provided we had each gas alone in the same 
volume v as that of the mixture in order that its molecules should
exhibit the same spectrum of eigenstates ε_k or ε'_r in the pure and mixed
conditions, and at the same temperature T as that of the mixture in
order to secure unchanged exponential factors. This result can be
expressed in the form

$$Z(n+m, v, T) = Z(n, v, T)\,Z(m, v, T), \tag{138.5}$$

where the arguments of the quantities Z indicate the system for which
the sum-over-states is to be considered.

It will now be convenient to take logarithms, since logZ is the 
quantity immediately appearing in our formulae for the evaluation of 
thermodynamic quantities. We can then write for a sufficiently dilute 
mixture of two gases 

$$\log Z(n+m, v, T) = \log Z(n, v, T) + \log Z(m, v, T), \tag{138.6}$$





and this result can evidently be immediately extended to any number 
of components. Turning now to our general formulae (133.3) to (133.6) 
for the calculation of the quantities of thermodynamic interest, pressure
p, energy E, heat capacity at constant volume C_v, and entropy S, we
immediately see that we shall have in general for a sufficiently dilute
mixture of gases the additive relations

$$\begin{aligned}
p(n+m, v, T) &= p(n, v, T) + p(m, v, T), \\
E(n+m, v, T) &= E(n, v, T) + E(m, v, T), \\
C_v(n+m, v, T) &= C_v(n, v, T) + C_v(m, v, T), \\
S(n+m, v, T) &= S(n, v, T) + S(m, v, T),
\end{aligned} \tag{138.7}$$

in agreement with well-known facts as to the properties of perfect gases
and their mixtures. The quantities on the left-hand side of the above
equations designate properties of the mixture of n+m molecules of the
two gases, and those on the right-hand side designate properties of
the separate gases when present alone at the same volume and temperature
as that of the whole mixture.

For some purposes it is preferable to express the thermodynamic 
properties of the mixture in terms of the properties of the separate 
gases when present alone at the same pressure and temperature as the 
mixture, rather than at the same volume and temperature. To investi- 
gate this we may now introduce the indices 1, 2, and 12 to designate 
the pure gases and their mixture respectively under any conditions of 
interest. Making use of the simple dependence of Z on v for a single 
gas as given by (136.5), we may then rewrite our expression (138.6) for 
the sum-over-states of the mixture in the somewhat generalized form 
$$\log Z_{12}(n_1+n_2, v_{12}, T) = \log Z_1(n_1, v_1, T) + n_1\log\frac{v_{12}}{v_1} + \log Z_2(n_2, v_2, T) + n_2\log\frac{v_{12}}{v_2}, \tag{138.8}$$

where v_12 is the volume of the mixture, v_1 and v_2 are any volumes that
we wish to choose for the separate gases, and T is the temperature
for each of the three systems under consideration. By choosing the
volumes v_1 and v_2, in accordance with the expressions

$$\frac{v_{12}}{v_1} = \frac{n_1+n_2}{n_1} \quad\text{and}\quad \frac{v_{12}}{v_2} = \frac{n_1+n_2}{n_2}, \tag{138.9}$$

we shall then have our separate gases present at the same pressures
as that of the mixture, and can rewrite (138.8) in the desired form

$$\log Z_{12} = \log Z_1 + \log Z_2 + n_1\log\frac{n_1+n_2}{n_1} + n_2\log\frac{n_1+n_2}{n_2}, \tag{138.10}$$





where Z_1, Z_2, and Z_12 now refer to the separate gases and to the mixture
all at the same pressure and temperature.

With the help of our general formulae (133.3) to (133.6) for the
evaluation of thermodynamic quantities, we can now write

$$\begin{aligned}
p_{12} &= p_1 = p_2, \qquad E_{12} = E_1 + E_2, \qquad (C_v)_{12} = (C_v)_1 + (C_v)_2, \\
S_{12} &= S_1 + S_2 + n_1 k\log\frac{n_1+n_2}{n_1} + n_2 k\log\frac{n_1+n_2}{n_2},
\end{aligned} \tag{138.11}$$

where the subscripts 12 and 1 and 2 now refer respectively to the
mixture of gases at a given pressure and temperature, and to the pure
gases alone at that same pressure and temperature. It may be remarked
that the last two terms in the expression for entropy, which give the
so-called entropy of mixing, arise from the circumstance that the formula
for the evaluation of entropy contains a term depending directly on 
log Z as well as on its derivatives. 
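
The entropy of mixing, n_1 k log((n_1+n_2)/n_1) + n_2 k log((n_1+n_2)/n_2), can be illustrated with a short calculation. The sketch below works in units of k and also evaluates the equivalent mole-fraction form; the function names are illustrative and the sample numbers are arbitrary.

```python
import math

def mixing_entropy_over_k(n1, n2):
    """Entropy of mixing divided by k for n1 and n2 molecules of two kinds."""
    total = n1 + n2
    return n1 * math.log(total / n1) + n2 * math.log(total / n2)

def mixing_entropy_from_fractions(n1, n2):
    """Same quantity written as -(n1 + n2) * sum of x_i log x_i over mole fractions."""
    total = n1 + n2
    x1, x2 = n1 / total, n2 / total
    return -total * (x1 * math.log(x1) + x2 * math.log(x2))

if __name__ == "__main__":
    for n1, n2 in ((1.0, 1.0), (1.0, 3.0), (1.0, 99.0)):
        print(n1, n2, mixing_entropy_over_k(n1, n2), mixing_entropy_from_fractions(n1, n2))
```

The expression is symmetric in the two kinds of molecule and is always positive; for equal numbers of the two kinds it reduces to (n_1+n_2) log 2.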

(b) Mixed crystals. We may now turn from gaseous to crystalline
mixtures and treat the ideal limiting case, which would be approached
when the molecules of the two kinds of substance become so nearly 
alike as to the volume they occupy and the forces they exert on their 
neighbours, that they can be used indiscriminately for constructing 
crystals of any desired composition without changing the structure or 
dimensions of the crystal lattice or affecting the elastic properties of 
the crystal. Assuming these limiting ideal conditions, let us then consider,
at a given pressure p and temperature T, the two pure crystals
consisting of n_1 molecules of the first kind and n_2 of the second kind,
and the mixed crystal consisting of n_1+n_2 molecules of the two kinds
taken together. In accordance with the above assumptions, the volumes
of the three crystals would then be additive, with

$$v_1 + v_2 = v_{12}. \tag{138.12}$$

And, in accordance with the general treatment which we have given
to crystals in § 137, see (137.22) and (137.23), it is evident that the
logarithm of the sum-over-states for each of the three crystals would
be given by an expression of the same form

$$\log Z = -\frac{E_0}{kT} - 9n\,\frac{T^3}{\Theta^3}\int_0^{\Theta/T}x^2\log(1-e^{-x})\,dx - \frac{9}{8}\,n\,\frac{\Theta}{T} + \log G_0, \tag{138.13}$$

where E_0 is the energy assigned to the crystal in the absence of vibrational
excitation, G_0 is the number of actually independent quantum
states that correspond to this condition, and Θ is a parameter which





has the same value for all three of our idealized crystals since it depends 
solely on elastic properties and lattice dimensions in accordance with 



$$\Theta = \frac{hC}{k}\left(\frac{9n}{4\pi v}\right)^{1/3}. \tag{138.14}$$


As a consequence of the above generally applicable form (138.13) 
and of the additivity of the values of Eq in the case of our idealized 
crystals, it is then evident that we could relate the sum-over-states 
for the mixed crystal to the corresponding quantities for the two pure 
crystals by

$$\log Z_{12} = \log Z_1 + \log Z_2 + \log\frac{G_{12}}{G_1 G_2}, \tag{138.15}$$

where G_12, G_1, and G_2 denote the numbers of independent quantum
states for the three crystals when they are vibrationally unexcited.
Furthermore, in typical cases we can evidently expect G_12 to differ
from the product G_1 G_2 only because we must take each different mode
of arranging the two kinds of molecule in the lattice for the mixed
crystal as corresponding, from a microscopic point of view, to a different
quantum-mechanical state. We can then put

$$G_{12} = \frac{(n_1+n_2)!}{n_1!\,n_2!}\,G_1 G_2, \tag{138.16}$$

where the factor expresses the number of such modes of arrangement.
Using the Stirling approximation for factorials, omitting negligible
terms when n_1 and n_2 are large, we then have as our desired expression
for the sum-over-states of the mixed crystal

$$\log Z_{12} = \log Z_1 + \log Z_2 + n_1\log\frac{n_1+n_2}{n_1} + n_2\log\frac{n_1+n_2}{n_2}. \tag{138.17}$$
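
The use of the Stirling approximation at this step can be checked numerically. The sketch below compares the exact logarithm of the number of arrangements, log[(n_1+n_2)!/(n_1! n_2!)], computed with the log-gamma function, with the approximate form n_1 log((n_1+n_2)/n_1) + n_2 log((n_1+n_2)/n_2). The function names and sample sizes are illustrative assumptions.

```python
import math

def log_arrangements_exact(n1, n2):
    """log of (n1 + n2)! / (n1! n2!), using lgamma(m + 1) = log(m!)."""
    return math.lgamma(n1 + n2 + 1) - math.lgamma(n1 + 1) - math.lgamma(n2 + 1)

def log_arrangements_stirling(n1, n2):
    """Leading Stirling form n1 log((n1+n2)/n1) + n2 log((n1+n2)/n2)."""
    total = n1 + n2
    return n1 * math.log(total / n1) + n2 * math.log(total / n2)

def relative_error(n1, n2):
    """Relative difference between the Stirling form and the exact value."""
    exact = log_arrangements_exact(n1, n2)
    return abs(log_arrangements_stirling(n1, n2) - exact) / exact

if __name__ == "__main__":
    for n in (10, 1000, 10**6):
        print(n, log_arrangements_exact(n, n), log_arrangements_stirling(n, n), relative_error(n, n))
```

The relative error should fall steadily as the numbers of molecules grow, which is the sense in which the omitted terms are negligible for crystals of macroscopic size.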

Making use of our general formulae (133.3) to (133.6) for evaluating
quantities of thermodynamic interest, we then also have for the pressure
p, energy E, heat capacity at constant volume C_v, and entropy S
of our mixed crystal in terms of those quantities for the two pure
components

$$\begin{aligned}
p_{12} &= p_1 = p_2, \qquad E_{12} = E_1 + E_2, \qquad (C_v)_{12} = (C_v)_1 + (C_v)_2, \\
S_{12} &= S_1 + S_2 + n_1 k\log\frac{n_1+n_2}{n_1} + n_2 k\log\frac{n_1+n_2}{n_2}.
\end{aligned} \tag{138.18}$$

These expressions are of just the same form as the analogous ones for
mixtures of perfect gases, when the properties of the pure gases are
related to volumes v_1 and v_2 such that the pressures will be the same
as for the mixture as a whole.





The above treatment of mixed crystals is of course a highly idealized 
one since it applies under the limiting conditions which we have assumed 
as to negligible effects from the replacement of one kind of molecule 
by another in the crystal structure. Nevertheless, it provides a good 
starting-point from which deviations may be considered. The ideal 
conditions would be very nearly approached in the case of mixtures
of isotopes. The corresponding treatment, which was given to gaseous 
mixtures, has a wider validity since it should apply in general when 
the gases are sufficiently dilute.

(c) Liquid mixtures. Some progress along the lines of the above 
methods can also be made in understanding the properties of liquid 
solutions. For this purpose we may consider the ideal limiting case of 
two liquids, which are composed of molecules exerting sufficiently
similar forces so that there would be no change in total volume on 
mixing at a given pressure p and temperature T, and so that a molecule 
in the mixture would have the same forces exerted on it by its neigh- 
bours as in the pure liquid form. 

For the volume of a mixture of $n_1$ and $n_2$ molecules of the two
kinds we could then write

$$v_{12} = v_1 + v_2, \tag{138.19}$$

where $v_1$ and $v_2$ are the volumes of the pure component liquids at the
given pressure and temperature.

Furthermore, in accordance with our assumptions, we could regard 
the energy spectra of the individual molecules of the two kinds as 
unaltered on making the mixture, since in the case of a condensed liquid 
phase the energy levels of a molecule could depend on environment but 
only very slightly on total volume. Following the methods used in 
§134 for calculating the sum-over-states as dependent on molecular 
states, assuming temperatures sufficient so that symmetry restrictions 
can be neglected, we should then be led to take


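$$Z_1 = \Bigl(\sum_k e^{-\epsilon_{1k}/kT}\Bigr)^{n_1}, \qquad Z_2 = \Bigl(\sum_k e^{-\epsilon_{2k}/kT}\Bigr)^{n_2}, \qquad Z_{12} = \frac{(n_1+n_2)!}{n_1!\,n_2!}\,Z_1 Z_2, \tag{138.20}$$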




where $\epsilon_{1k}$ and $\epsilon_{2k}$ denote the energy eigenstates for a single molecule of
the two kinds, as giving fair first approximations for the sum-over- 
states of the two pure liquids and of the mixture. 

The factor $(n_1+n_2)!/(n_1!\,n_2!)$, by which $Z_1 Z_2$ is multiplied in the third of these expres-
sions, can be regarded as roughly justified if we assume the approximate
validity of treating our mixture as corresponding to permanent spatial 
arrangements of the two kinds of molecules, which could be trans- 
formed into each other by the interchange of unlike molecules. The 
factor would then give the number of different quantum mechanical 
states for the system as a whole that would result from such inter- 
changes. 

This approximate picture, by neglecting the effect of total volume 
on molecular energy levels, and neglecting the interchanges of unlike 
molecules which could occur by diffusion, provides a rough treatment 
for mixed liquids similar to the treatment already given to mixed 
crystals rather than to that given to mixed gases. For a more correct 
picture, modifications in the direction of gas-like behaviour might be 
introduced. Nevertheless, since the two limiting cases of permanent 
crystals and dilute gases have both led to the same dependence of $Z_{12}$
on $Z_1$ and $Z_2$ which we have assumed above, we may regard this
expression as fairly satisfactory. 

Our approximate picture has also given no consideration to the effect 
of the relative volumes of the two kinds of molecules in the mixed 
liquid on the number of interchanges that would be possible. If these 
volumes were equal, the factor $(n_1+n_2)!/(n_1!\,n_2!)$ under discussion would
seem correct. On the other hand, it might still be a good first approxi-
mation in a variety of cases. For example, as suggested by a conversa-
tion with Professor J. H. Hildebrand, a mixture of two kinds of straight
chain organic molecules, differing greatly in length but having similar
chemical character along the chain and at the ends, might provide the
conditions necessary for the approximate validity of the above equa-
tions. If such molecules were arranged end to end in filaments which
were then packed parallel in the liquid, each molecule in the mixture
would find itself in a similar environment as in the pure liquid, and the
total number of arrangements would be given by the above factor.
The actual structure of the liquid mixture might tend to approximate
these conditions.

Assuming the validity of (138.20) as a good first approximation, we
then obtain

$$\log Z_{12} = \log Z_1 + \log Z_2 + n_1 \log\frac{n_1+n_2}{n_1} + n_2 \log\frac{n_1+n_2}{n_2}, \tag{138.21}$$



and are led to thermodynamic expressions similar to those in the case
of perfect gases and idealized crystals. 


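$$p_{12} = p_1 = p_2, \qquad E_{12} = E_1 + E_2, \qquad (C_v)_{12} = (C_v)_1 + (C_v)_2,$$

$$S_{12} = S_1 + S_2 + n_1 k \log\frac{n_1+n_2}{n_1} + n_2 k \log\frac{n_1+n_2}{n_2}, \tag{138.22}$$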


for the values of the pressure, energy, heat capacity, and entropy of 
the mixture in terms of those same quantities for the pure liquids. 

(d) On the definition of the ideal solution. In view of the similar 
character of the results which we have obtained in all three above 
cases of idealized gaseous, crystalline, and liquid mixtures, we are now 
in a position to propose a general statistical-mechanical definition of
the ideal or perfect solution, which can be regarded as a standard of
reference in treating the properties of actual solutions. Using the in-
dices 1, 2, and 12 to denote in general the two pure phases consisting
of $n_1$ and $n_2$ molecules of the two kinds and the similar mixed phase
made by their combination, and taking conditions such that the pres-
sures and temperatures satisfy


$$p_{12} = p_1 = p_2, \qquad T_{12} = T_1 = T_2, \tag{138.23}$$


we may define the ideal solution as having a volume which satisfies the
additive relation

$$v_{12} = v_1 + v_2 \tag{138.24}$$

and as having a sum-over-states which satisfies the relation 

$$\log Z_{12} = \log Z_1 + \log Z_2 + n_1 \log\frac{n_1+n_2}{n_1} + n_2 \log\frac{n_1+n_2}{n_2}. \tag{138.25}$$

In accordance with our general formulae (133.3) to (133.6) for the 
evaluation of thermodynamic quantities, we can then also show that
the energy, heat capacity at constant volume, and entropy of our ideal 
solution would be given by 

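$$E_{12} = E_1 + E_2, \qquad (C_v)_{12} = (C_v)_1 + (C_v)_2,$$

$$S_{12} = S_1 + S_2 + n_1 k \log\frac{n_1+n_2}{n_1} + n_2 k \log\frac{n_1+n_2}{n_2}. \tag{138.26}$$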


The foregoing relations may be readily extended to any number of 
components. 
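The extension to any number of components may be illustrated by a short calculation. The following Python sketch is offered only as an illustration and is not part of the original treatment: it assumes that the entropy in excess of the sum for the pure components takes the form $\sum_i N_i R \log(N/N_i)$, with $N_i$ the number of mols of each component and $N$ their total, which is the natural generalization of the last two terms of (138.26) when $n_i k$ is replaced by $N_i R$. The function name and the SI value of $R$ are choices made here.

```python
import math

R = 8.314462618  # gas constant in J/(mol K); SI units are an assumption of this sketch


def mixing_entropy(mols):
    """Ideal entropy of mixing, sum_i N_i R log(N_total / N_i), in J/K.

    `mols` is a sequence of mol numbers N_i; components with N_i = 0
    contribute nothing and are skipped.
    """
    total = sum(mols)
    return sum(N * R * math.log(total / N) for N in mols if N > 0)


if __name__ == "__main__":
    print(mixing_entropy([1.0, 1.0]))       # binary mixture, one mol of each
    print(mixing_entropy([1.0, 1.0, 1.0]))  # ternary mixture, one mol of each
```

A single pure component gives zero, and the result does not depend on the order in which the components are listed, as the symmetry of the expression requires.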

It will be of interest to compare the above statistical-mechanical
definition of the ideal solution with the thermodynamic definition pro-
posed by Lewis,† which states that the fugacity of each component of
the solution would obey the generalized Raoult law

$$f = f^0 x, \tag{138.27}$$

where $f$ and $f^0$ are the fugacities of the component in the mixture and
in the pure state respectively, and $x$ is its molal fraction in the mixture.

† Lewis, Journ. Amer. Chem. Soc. 30, 668 (1908).

To show that this relation would also be satisfied if we take the 
proposed statistical-mechanical definition of a perfect solution, it will 
first be necessary to recall the following definitions of thermodynamic 
quantities. For the thermodynamic potential $F$ of a system at a given
pressure $p$ and temperature $T$, we have

F = E+pv-TS. (138.28) 

For the partial molal potential $\bar{F}$ of any component of the system, say
the component 1, we then have

$$\bar{F}_1 = \left(\frac{\partial F}{\partial N_1}\right)_{p,\,T,\,N_2,\ldots} \tag{138.29}$$

where the number of mols of the different substances composing the
system are denoted by $N_1$, $N_2$, etc. And for the fugacity $f_1$ of that com-
ponent we then have by definition the relation

$$RT \log f_1 = \bar{F}_1 + \text{const.} \tag{138.30}$$


Turning now, however, to the proposed statistical-mechanical defini-
tion of a perfect solution and its thermodynamic consequences given
by (138.26), we can evidently write for the thermodynamic potential
$F_{12}$ of a binary solution

$$F_{12} = F_1 + F_2 - N_1 RT \log\frac{N_1+N_2}{N_1} - N_2 RT \log\frac{N_1+N_2}{N_2}, \tag{138.31}$$

where $F_1$ and $F_2$ are the potentials of the two components in pure form.
Differentiating with respect to $N_1$ we then obtain

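$$\bar{F}_1 = \bar{F}_1^0 - RT \log\frac{N_1+N_2}{N_1} - \frac{N_1}{N_1+N_2}RT + RT - \frac{N_2}{N_1+N_2}RT = \bar{F}_1^0 - RT \log\frac{N_1+N_2}{N_1} \tag{138.32}$$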


as a relation which connects the value of partial molal potential $\bar{F}_1$ for
the indicated component in solution with its value $\bar{F}_1^0$ in the pure state.
And substituting (138.30), we then obtain 


$$f_1 = \frac{N_1}{N_1+N_2}\, f_1^0 \tag{138.33}$$


for the fugacities $f_1$ and $f_1^0$ of that component in the solution and pure
state respectively, and relations of the same form could evidently be 
derived for each constituent in the case of a perfect solution composed 
of any number of components. 
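The differentiation leading from the potential of the solution to the generalized Raoult law can be verified numerically. The following Python sketch is an illustration only, not part of the original derivation: it takes the mixing part of the thermodynamic potential of a binary ideal solution in the form $-N_1RT\log\{(N_1+N_2)/N_1\} - N_2RT\log\{(N_1+N_2)/N_2\}$, differentiates it with respect to $N_1$ by a central difference, and converts the result to a ratio of fugacities by means of the defining relation $RT\log f_1 = \bar{F}_1 + \text{const}$. The function names, the step size, and the SI value of $R$ are assumptions of the sketch.

```python
import math

R = 8.314462618  # gas constant in J/(mol K); SI units are an assumption of this sketch


def mixing_potential(N1, N2, T):
    """Mixing part of the potential: -N1 RT log(N/N1) - N2 RT log(N/N2)."""
    n = N1 + N2
    return -N1 * R * T * math.log(n / N1) - N2 * R * T * math.log(n / N2)


def partial_molal_shift(N1, N2, T, h=1e-6):
    """Central-difference estimate of d(mixing_potential)/dN1 at fixed N2 and T."""
    upper = mixing_potential(N1 + h, N2, T)
    lower = mixing_potential(N1 - h, N2, T)
    return (upper - lower) / (2 * h)


def fugacity_ratio(N1, N2, T):
    """Ratio f1/f1^0 implied by RT log f1 = F1bar + const."""
    return math.exp(partial_molal_shift(N1, N2, T) / (R * T))


if __name__ == "__main__":
    # One mol of component 1 with three mols of component 2 at 300 K.
    print(fugacity_ratio(1.0, 3.0, 300.0), 1.0 / (1.0 + 3.0))
```

The two printed numbers should agree to within the error of the finite difference, the second being simply the mol fraction $N_1/(N_1+N_2)$.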




We thus see that our statistical-mechanical definition of a perfect or 
ideal solution will lead to the same results as the earlier thermodynamic 
definition. The statistical-mechanical viewpoint may have some ad-
vantages, however, in making it easier to see under what conditions we
may expect the properties of actual solutions to agree with or deviate
from those of the ideal solution.

139. Vapour pressures and chemical equilibria 

(a) The thermodynamic potentials of crystals and of gases. We are 
now ready to use some of the results which have been obtained in this 
chapter to illustrate the methods of treating physical-chemical equi- 
libria. We shall give brief consideration first to equilibria between 
crystals and their vapours, and then to chemical equilibria between 
reacting gases. 

In the thermodynamic treatment of physical-chemical equilibria it 
is often most convenient to consider the thermodynamic potential $F$ of
the system of interest.† This quantity may be defined in terms of the
energy $E$, pressure $p$, volume $v$, temperature $T$, and entropy $S$ of
the system under consideration by the equation

$$F = E + pv - TS. \tag{139.1}$$

And the condition for thermodynamic equilibrium at a given pressure
and temperature will then be given by the expression

$$\delta F = 0 \tag{139.2}$$

for any variation in the condition of the system, holding the pressure
and temperature constant. The above relations (139.1) and (139.2) are
themselves purely thermodynamic. Nevertheless, if we substitute for
the quantities composing $F$ values obtained from statistical-mechanical
theory, rather than the empirical values which would have to be used 
in a purely thermodynamic treatment, we shall reap the full reward of 
our more powerful statistical-mechanical methods. 

† This quantity is sometimes called ‘free energy’ by chemists, although it differs by
the term $pv$ from the quantity originally given that name by Helmholtz.

For the purposes of the present section we shall be interested in the
thermodynamic potentials of crystals at temperatures low enough so
that we can apply the low-temperature formulae for energy and entropy
obtained in § 137 (d). We can use those formulae with great confidence
since their derivation is relatively independent of the assumed form of
distribution for the frequencies of modes of vibration, except in the
region of wave-lengths, long compared with intermolecular distances,
where the form should be valid. Neglecting in the expression for $F$
given by (139.1) the small term pv, which would be inappreciable for 
a condensed phase at the pressures that will interest us, we can then 
write, in accordance with our previous expressions (137.29) and
(137.31) for the low-temperature energy and entropy of a crystal,

$$F = E_0 + \tfrac{9}{8}\, n k \Theta - \frac{\pi^4}{5}\, n k \frac{T^4}{\Theta^3} - kT \log G_0 \tag{139.3}$$

as an expression for the thermodynamic potential of a crystal of $n$
molecules at low temperature. Or putting

$$n\epsilon_c = E_0 + \tfrac{9}{8}\, n k \Theta \tag{139.4}$$

as a convenient expression for the total energy of the unexcited crystal
including its residual half-quanta of vibrational energy, and introducing
in accordance with (137.45) the more convenient expression,

$$k \log G_0 = n k \log g_c, \tag{139.5}$$

for the effect of the multiplicity of the ground state of the crystal, we
can write as the desired expression for the thermodynamic potential of
a crystal at low temperatures

$$F = n\epsilon_c - \frac{\pi^4}{5}\, n k \frac{T^4}{\Theta^3} - n k T \log g_c, \tag{139.6}$$

where the characteristic temperature $\Theta$, which depends on the elastic
properties of the crystal, can be safely regarded as a constant at the
temperatures and pressures that will interest us.

Turning next to monatomic gases, we shall be interested in tempera- 
tures high enough so that gas degeneration can be neglected. Setting 
$pv = nkT$ in (139.1), we can then write, in accordance with the expres-
sion connecting entropy and energy given by (136.18), for the thermo-
dynamic potential of a monatomic gas, consisting of $n$ molecules at
pressure $p$ and temperature $T$,

$$F = n\epsilon_0 - \tfrac{5}{2}\, n k T \log T + n k T \log p - n k T \log \sum_i \frac{g_i}{g_0}\, e^{-(\epsilon_i-\epsilon_0)/kT} - n k T \log \frac{(2\pi m)^{3/2} k^{5/2} g_0}{h^3}, \tag{139.7}$$

where $m$ is the mass of a single atom, $\epsilon_0$ and $g_0$ are the energy and the
quantum weight of the ground state of the atom, and the term involving
summation is over all appreciably excited electronic states $i$.

Turning finally to diatomic gases, we shall be interested in tempera-
tures high enough so that gas degeneration can be neglected, and so
that the rotational energy of the molecules can be regarded as fully
excited. Again setting $pv = nkT$ in (139.1), we can then write, in
accordance with the expression connecting entropy and energy given
by (136.31), for the thermodynamic potential of a diatomic gas

$$F = n\epsilon_0 - \tfrac{7}{2}\, n k T \log T + n k T \log p - n k T \log \sum_{v,e} \frac{g_{ve}}{g_0}\, e^{-(\epsilon_{ve}-\epsilon_0)/kT} - n k T \log \left[\frac{(2\pi m)^{3/2} k^{5/2}}{h^3}\,\frac{8\pi^2 I k\, g_0}{\sigma h^2}\right], \tag{139.8}$$

where the newly appearing quantities $I$ and $\sigma$ are the moment of inertia
and symmetry number of the molecule, and the term involving summa-
tion is now over all appreciably excited vibrational and electronic
states $v, e$.

(b) Vapour pressures of crystals. We can now use the foregoing 
expressions for thermodynamic potentials to obtain formulae for the 
vapour pressures of crystals, over a temperature range where the 
crystals can be treated by low-temperature methods and their vapours 
can be regarded as perfect gases with translational and rotational
energy fully excited. Such a range exists for all substances except 
hydrogen, where the full excitation of rotational levels only occurs at 
somewhat high temperatures owing to the low moment of inertia of 
the molecule. 

Noting the condition for equilibrium given by (139.2), it is evident 
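that a crystal and its vapour would be in equilibrium when we have
pressures and temperatures such that the thermodynamic potentials
would be equal for the same amount of substance in the crystalline and
in the gaseous form. Hence, by equating (139.6) first with (139.7)
and then with (139.8), and solving for $p$, we can obtain expressions for
the equilibrium or vapour pressure in the respective cases where the
substance is monatomic or diatomic in its gaseous phase. Doing so, we
then obtain, after some rearrangement, for the vapour pressure of a
crystal which gives a monatomic gas

$$\log p = -\frac{\epsilon_0-\epsilon_c}{kT} + \tfrac{5}{2} \log T - \frac{\pi^4}{5}\frac{T^3}{\Theta^3} + \log \sum_i \frac{g_i}{g_0}\, e^{-(\epsilon_i-\epsilon_0)/kT} + \log \left[\frac{(2\pi m)^{3/2} k^{5/2}}{h^3}\,\frac{g_0}{g_c}\right] \tag{139.9}$$

and obtain for the vapour pressure of a crystal which gives a diatomic gas

$$\log p = -\frac{\epsilon_0-\epsilon_c}{kT} + \tfrac{7}{2} \log T - \frac{\pi^4}{5}\frac{T^3}{\Theta^3} + \log \sum_{v,e} \frac{g_{ve}}{g_0}\, e^{-(\epsilon_{ve}-\epsilon_0)/kT} + \log \left[\frac{(2\pi m)^{3/2} k^{5/2}}{h^3}\,\frac{8\pi^2 I k}{h^2}\,\frac{g_0}{\sigma g_c}\right], \tag{139.10}$$

where the significance of the various quantities involved has already
been noted.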

These formulae apply of course in the temperature range which was 
specified above. With the help of the well-known purely thermo- 
dynamic equation of Clapeyron for the change in vapour pressure with 
temperature, however, we can of course calculate the vapour pressure 
at any desired temperature if we know it at some one temperature. 
This equation for the dependence of vapour pressure on temperature 
may be written in the closely approximate form 


$$\frac{d \log p}{dT} = \frac{\Lambda}{RT^2}, \tag{139.11}$$

where $\Lambda$ is the heat of evaporation per mol at the temperature in
question. As an expression for this heat of evaporation as a function
of temperature we can write

$$\Lambda = \Lambda_0 + \int_0^T \Delta C_p \, dT, \tag{139.12}$$

where $\Lambda_0$ is the heat of evaporation approached at very low tempera-
tures, and $\Delta C_p$ is the difference in heat capacities at constant pressure
between a mol of the substance in gaseous and condensed form.
Furthermore, we can also put

$$\Delta C_p = \left(\tfrac{5}{2},\ \tfrac{7}{2}\right) R + \Delta C_p', \tag{139.13}$$

where the first term gives the constant part of this difference for the
two cases of monatomic and diatomic gases respectively, and the second
term gives the part which varies with temperature. This then gives us

$$\Lambda = \Lambda_0 + \left(\tfrac{5}{2},\ \tfrac{7}{2}\right) RT + \int_0^T \Delta C_p' \, dT. \tag{139.14}$$


Substituting (139.14) in (139.11), and integrating, we then obtain

$$\log \frac{p_2}{p_1} = -\frac{\Lambda_0}{R}\left(\frac{1}{T_2} - \frac{1}{T_1}\right) + \left(\tfrac{5}{2},\ \tfrac{7}{2}\right) \log \frac{T_2}{T_1} + \int_{T_1}^{T_2} \frac{1}{RT^2} \int_0^{T} \Delta C_p' \, dT' \, dT \tag{139.15}$$

as the desired relation between the vapour pressures $p_1$ and $p_2$ at tem-
peratures $T_1$ and $T_2$. This expression is valid over any temperature
range where the vapour can be treated as a perfect gas with a volume
very large compared with that of the condensed phase; it can also be
used over ranges where there is a transition in state of the condensed
phase, say from crystalline to liquid form, provided we give appro-
priate treatment to the heat of transition.
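The integration of the Clapeyron equation can be checked numerically in the simplest case. The following Python sketch is an illustration only and not part of the original treatment. It assumes that the temperature-dependent part of the difference in heat capacities can be dropped, so that the heat of evaporation is $\Lambda_0 + cRT$ with $c = 5/2$ or $7/2$; it then integrates $d\log p/dT = \Lambda/RT^2$ by Simpson's rule and compares the result with the closed form $-(\Lambda_0/R)(1/T_2 - 1/T_1) + c\log(T_2/T_1)$. The value of $\Lambda_0$ used in the example is hypothetical, and the function names and SI units are choices made here.

```python
import math

R = 8.314462618  # gas constant in J/(mol K); SI units are an assumption of this sketch


def heat_of_evaporation(T, lambda0, c=2.5):
    """Lambda = Lambda0 + c R T, the variable part of Delta C_p being neglected."""
    return lambda0 + c * R * T


def log_pressure_ratio_numeric(T1, T2, lambda0, c=2.5, steps=2000):
    """Simpson's-rule integral of Lambda / (R T^2) from T1 to T2."""
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of intervals
    h = (T2 - T1) / steps

    def integrand(T):
        return heat_of_evaporation(T, lambda0, c) / (R * T * T)

    total = integrand(T1) + integrand(T2)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(T1 + j * h)
    return total * h / 3


def log_pressure_ratio_closed(T1, T2, lambda0, c=2.5):
    """Closed form of the same integral."""
    return -(lambda0 / R) * (1 / T2 - 1 / T1) + c * math.log(T2 / T1)


if __name__ == "__main__":
    lambda0 = 7000.0  # hypothetical heat of evaporation at low temperature, J/mol
    print(log_pressure_ratio_numeric(60.0, 80.0, lambda0))
    print(log_pressure_ratio_closed(60.0, 80.0, lambda0))
```

Where the variable part of the heat capacity difference is appreciable, the double integral in the general expression would have to be evaluated from empirical heat capacities in the same manner.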

In order to combine the consequences of (139.15) with our previous
expressions (139.9) and (139.10) for vapour pressures in a low-tempera-
ture range we must note in the first place that we can obviously take

$$\frac{\Lambda_0}{R} = \frac{\epsilon_0 - \epsilon_c}{k}. \tag{139.16}$$

We must also note, in the second place, that the two terms which
appear next to the last in the expressions mentioned would go to zero
at sufficiently low temperatures. It will then be readily seen that we
can write as quite general expressions for the vapour pressure of a mon-
atomic gas

$$\log p = -\frac{\Lambda_0}{RT} + \tfrac{5}{2} \log T + \int_0^T \frac{1}{RT'^2} \int_0^{T'} \Delta C_p' \, dT'' \, dT' + i_1 \tag{139.17}$$

and for the vapour pressure of a diatomic gas

$$\log p = -\frac{\Lambda_0}{RT} + \tfrac{7}{2} \log T + \int_0^T \frac{1}{RT'^2} \int_0^{T'} \Delta C_p' \, dT'' \, dT' + i_2, \tag{139.18}$$

where the vapour-pressure constants $i_1$ and $i_2$ are given by


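$$i_1 = \log \left[\frac{(2\pi m)^{3/2} k^{5/2}}{h^3}\,\frac{g_0}{g_c}\right] \tag{139.19}$$

and

$$i_2 = \log \left[\frac{(2\pi m)^{3/2} k^{5/2}}{h^3}\,\frac{8\pi^2 I k}{h^2}\,\frac{g_0}{\sigma g_c}\right]. \tag{139.20}$$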


A satisfactory comparison between these theoretical values of the
vapour-pressure constants $i_1$ and $i_2$ and the empirical values for a con-
siderable number of monatomic and diatomic vapours will be found in
Fowler’s Statistical Mechanics.† We do not attempt any corresponding
general treatment for polyatomic vapours, since the complexities of
polyatomic molecules make specific treatment in each particular case
advisable.‡
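To indicate how such a theoretical vapour-pressure constant might be evaluated in practice, the following Python sketch computes the monatomic constant in the form $\log[(2\pi m)^{3/2}k^{5/2}g_0/(h^3 g_c)]$. It is an illustration only and not part of the original treatment. The sketch assumes natural logarithms and SI units (pressure in pascals, temperature in kelvins), so that its numerical values differ by an additive constant from those tabulated in other systems of units; the function name and the default quantum weights of unity are choices made here.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
H = 6.62607015e-34       # Planck constant, J s
AMU = 1.66053906660e-27  # atomic mass constant, kg


def vapour_pressure_constant_monatomic(mass_amu, g0=1.0, gc=1.0):
    """Natural log of (2 pi m)^(3/2) k^(5/2) g0 / (h^3 gc), for p in Pa and T in K.

    `mass_amu` is the atomic mass in atomic mass units; `g0` and `gc` are the
    quantum weights of the atomic ground state and of the crystal per molecule.
    """
    m = mass_amu * AMU
    return (1.5 * math.log(2 * math.pi * m)
            + 2.5 * math.log(K_B)
            - 3.0 * math.log(H)
            + math.log(g0 / gc))


if __name__ == "__main__":
    # Example with an atomic mass close to that of argon.
    print(vapour_pressure_constant_monatomic(39.948))
```

The constant increases by $\tfrac{3}{2}\log$ of the mass ratio on passing from a lighter to a heavier atom, and by $\log(g_0/g_c)$ when the quantum weights differ from unity.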


† Fowler, Statistical Mechanics, second edition, Cambridge, 1936, pp. 218-26. The
notation used by Fowler may be connected with that used above by the following
relations:

For a crystal 

For a monatomic gas 

U!P) = iS ft* . MO) = Po = ff,. 

For a diatomic gas 

= n.(0) = ... = ffo. a=<T. 

‡ See Sterne, Phys. Rev. 39, 993 (1932), and 42, 656 (1932), for treatments of NH$_3$
and CH$_4$.





(c) Chemical equilibria in gases. We may also investigate the con- 
ditions for chemical equilibrium in reacting gases with the help of 
the expressions for the thermodynamic potentials of gases which are
provided by statistical-mechanical considerations. Let us consider a
mixture of gases $A$, $B$, $C$, $D$, etc., at some given pressure $p$ and tem-
perature $T$, and let a chemical reaction of the general type

aA+bB+... = cC+dD+... (139.21)

be possible, where $a$ mols of $A$ react with $b$ mols of $B$, etc., to give
$c$ mols of $C$ plus $d$ mols of $D$, etc. In accordance with our general
criterion for physical-chemical equilibria (139.2), we shall then have 
equilibrium with respect to this chemical reaction if the gases involved 
are present at such partial pressures that there would be no change in 
the thermodynamic potential of the system if the reaction in question 
should take place to an infinitesimal extent in either direction. Noting, 
for the case of perfect gases, that the thermodynamic potential for the 
system as a whole can be taken as the sum of expressions for the 
thermodynamic potentials of the individual gases at pressures equal 
to their partial pressures in the mixture, we can then write the con- 
dition for equilibrium in the form 

$$a F_A(n, p_A, T) + b F_B(n, p_B, T) + \ldots = c F_C(n, p_C, T) + d F_D(n, p_D, T) + \ldots, \tag{139.22}$$

where the symbols $F_A$, $F_B$, $F_C$, $F_D$, etc., denote the thermodynamic
potentials of the same number of molecules $n$ of the different gases at
the temperature of the mixture $T$, and at pressures $p_A$, $p_B$, $p_C$, $p_D$, etc.,
which are equal to the partial pressures of the different gases in the
mixture.


With the help of this expression of the thermodynamic conditions for 
equilibrium, we can now apply statistical mechanics by substituting 
for the individual potentials $F_A$, $F_B$, $F_C$, $F_D$, etc., the values com-
puted for them with the help of statistical mechanics. And since these
potentials are themselves dependent on the partial pressures of the 
gases involved, we can then study what combinations of such pressures 
would correspond to equilibrium. We are thus provided with a general 
method for applying statistical mechanics to the treatment of chemical 
equilibria in gases. 

We may illustrate this general method by considering the two dif-
ferent types of gaseous dissociation which would be expressed by the
chemical reactions

$$AB = A + B \tag{139.23}$$

and

$$A_2 = 2A, \tag{139.24}$$

where diatomic molecules consisting respectively of two different or
two similar atoms decompose.

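For the first of these reactions we can readily obtain, after some
rearrangement and cancellation, with the help of the condition for
equilibrium given by (139.22) and the expressions for the thermodynamic
potentials of monatomic and diatomic gases given by (139.7) and
(139.8),

$$\log K_p = \log \frac{p_A p_B}{p_{AB}} = -\frac{\epsilon_A + \epsilon_B - \epsilon_{AB}}{kT} + \tfrac{3}{2} \log T + \log \sum_i \frac{g_{Ai}}{g_A}\, e^{-(\epsilon_{Ai}-\epsilon_A)/kT} + \log \sum_i \frac{g_{Bi}}{g_B}\, e^{-(\epsilon_{Bi}-\epsilon_B)/kT}$$

$$\qquad - \log \sum_{v,e} \frac{g_{AB,ve}}{g_{AB}}\, e^{-(\epsilon_{AB,ve}-\epsilon_{AB})/kT} + \log \left[\left(\frac{2\pi m_A m_B}{m_A + m_B}\right)^{3/2} \frac{k^{3/2}}{8\pi^2 I h}\,\frac{g_A g_B}{g_{AB}}\right] \tag{139.25}$$

as an expression for the equilibrium constant $K_p$ which gives the rela-
tion between equilibrium values of the partial pressures of the three
gases. Similarly, for the second of the above reactions we obtain

$$\log K_p = \log \frac{p_A^2}{p_{A_2}} = -\frac{2\epsilon_A - \epsilon_{A_2}}{kT} + \tfrac{3}{2} \log T + 2 \log \sum_i \frac{g_{Ai}}{g_A}\, e^{-(\epsilon_{Ai}-\epsilon_A)/kT}$$

$$\qquad - \log \sum_{v,e} \frac{g_{A_2,ve}}{g_{A_2}}\, e^{-(\epsilon_{A_2,ve}-\epsilon_{A_2})/kT} + \log \left[\frac{(\pi m_A)^{3/2} k^{3/2}}{8\pi^2 I h}\,\frac{g_A^2}{g_{A_2}}\right] + \log 2, \tag{139.26}$$

where the additional term $\log 2$ arises from the symmetry number
$\sigma = 2$ for the molecule $A_2$. The symbolism in the above expressions
will be self-explanatory when compared with the symbolism in the
original formulae (139.7) and (139.8) for the thermodynamic potentials
of monatomic and diatomic gases.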

The expressions make the equilibrium constant for the two kinds of
dissociation reaction dependent, at a given temperature, on the energy
of dissociation of the molecule in its ground state into atoms in their
ground states, the spectra of energy levels for the molecule and atoms,
the molecular and atomic masses, the moment of inertia and symmetry
number of the molecule, and the quantum weights of the ground states
of vibrational and electronic excitation. The symmetry number 2
appears in the equilibrium expression for the case of like atoms in the
manner that would be expected from a consideration of the possible
frequencies of dissociation and recombination by collision.

Expressions for equilibrium constants such as the two given above 
can also be rewritten, if desired, in a form containing empirical heat 
capacities of the reacting gases instead of summations over the internal
energy levels of the atoms and molecules involved. To do this we may
make use of the purely thermodynamic equation of van’t Hoff for the 
dependence of equilibrium constants on temperature. In the case of 
a reaction between perfect gases as given by 

aA+bB+... = cC+dD+... (139.27) 

the van’t Hoff equation can be written in the form 


$$\frac{d \log K_p}{dT} = \frac{\Delta H}{RT^2}, \tag{139.28}$$


where $\Delta H$ is the change in ‘heat content’ $(E + pv)$ when the reaction as
written takes place. As an expression for this change in heat content
as a function of temperature we can write

$$\Delta H = \Delta H_0 + \int_0^T \Delta C_p \, dT = \Delta H_0 + \Delta C_p^0\, T + \int_0^T \Delta C_p' \, dT, \tag{139.29}$$

where $\Delta H_0$ is the change in heat content approached at very low tem-
peratures, $\Delta C_p$ is the change in heat capacity when the reaction as
written takes place, and $\Delta C_p^0$ and $\Delta C_p'$ are introduced as symbols to
denote respectively the constant part of this change in heat capacity
at temperatures where translation and rotation are fully excited, and
the variable part which changes with temperature as vibrational and
electronic energies become excited. Substituting (139.29) into (139.28),
and integrating between $T_1$ and $T_2$, we obtain for the ratio of equilibrium
constants at these two temperatures

$$\log \frac{K_2}{K_1} = -\frac{\Delta H_0}{RT_2} + \frac{\Delta H_0}{RT_1} + \frac{\Delta C_p^0}{R} \log \frac{T_2}{T_1} + \int_{T_1}^{T_2} \frac{1}{RT^2} \int_0^{T} \Delta C_p' \, dT' \, dT. \tag{139.30}$$

We may now compare this purely thermodynamic expression with
the statistical-mechanical expressions given by (139.25) and (139.26).
Noting that $\Delta H_0$ would be directly related to the reaction energy for
molecules in their ground-levels, noting that $\Delta C_p^0/R$ would be a more
general expression for the coefficient of $\log T$ previously appearing, and
noting that the terms in the statistical-mechanical expressions involving
summations over internal levels would tend to zero at low tempera-
tures, we may now conclude, even for more general cases than have
been illustrated, that the equilibrium conditions for a gas reaction could
be expressed in the form

$$\log K_p = -\frac{\Delta H_0}{RT} + \frac{\Delta C_p^0}{R} \log T + \int_0^T \frac{1}{RT'^2} \int_0^{T'} \Delta C_p' \, dT'' \, dT' + C, \tag{139.31}$$

where $C$ is a constant which can be evaluated from a knowledge of the
masses, moments of inertia, symmetry numbers, and quantum weights
of the ground-levels for the molecules involved in the reaction.

In the case of a number of gas reactions involving monatomic and
diatomic molecules, comparisons showing the agreement of the theoreti-
cal and empirical values of the constant $C$ will be found in Fowler’s
Statistical Mechanics.† We thus obtain a finally satisfactory outcome
for that long development in the theory of physical-chemical equilibria
which has gradually shown us how to supplement the mere require-
ments of the first and second laws of thermodynamics by additional
considerations. With this development we may always associate the
names, among others, of Nernst, Planck, Sackur and Tetrode, Ehren-
fest and Trkal, Lewis, Giauque, and Fowler.

(d) On the status of the so-called third law of thermodynamics. We
may now conclude this section with a few remarks as to the present
status of that form of supplementary principle to which the name of
third law of thermodynamics has become attached. During the past
three decades various attempts have been made to formulate a principle
applying to the entropies of chemical substances which would provide
consistent entropy zero-points for a given substance when considered
in different states of aggregation or of chemical combination. A com-
mon formulation has been the statement that all pure crystalline sub-
stances, including different crystalline forms of the same substance and
including compounds as well as elements, could be assigned zero entropy
at the absolute zero of temperature. As a consequence of this assump-
tion it then became possible, by combining with data as to heat capa-
cities and heats of reaction, to determine the conditions for chemical
equilibrium at any desired temperature. 

In accordance with our previous discussions, see in particular § 137 (e), 
it is now evident, nevertheless, that it would not be correct to assign 
zero entropy in general to all actual crystals in the states which they 
approach as we go to the lowest temperatures available in the labora-
tory. Indeed we have found rather, with the help of the statistical-
mechanical interpretation of entropy, that a rational and actually cor-
rect and consistent assignment of entropy zero-points can be most easily
obtained by taking the entropy of any system zero when it is known 
to be in a single pure quantum mechanical state. This, as we have seen, 
makes the low-temperature entropy of crystals equal to 
$$S_{T \to 0} = k \log G_0 = n k \log g_c, \tag{139.32}$$

where $G_0 = g_c^{\,n}$ is the quantum mechanical weight for the condition of
the crystal actually approached at low temperatures. And in agreement
with this we do have definitely known cases, as already mentioned in
§ 137 (e), where the change in entropy accompanying chemical reaction
between crystals would not approach zero at low temperatures. The
frequent validity or approximate validity of the so-called third law
must now be regarded as due to the fact that the residual entropies of
crystals as given by (139.32) do tend to be small and, in addition, to
cancel or partially cancel when the entropies of the same substances
in different crystalline forms are subtracted.

† Fowler, loc. cit., p. 228.

In view of the theoretical soundness and simplicity of assigning zero-
points in such a way that the entropy for any system in a single pure
state will be zero, and in view of the fact that this has been shown to give
correct evaluations of the constants in the equations for gaseous equili-
bria, we may now regard the so-called third law as quite properly replaced
by this assignment of zero entropy to any system in a single pure state. 
We thus obtain what appears to be an entirely satisfactory principle. 

If actual crystals, when taken to temperatures far below those ap- 
proached in the laboratory and held for an infinite length of time, 
should of themselves go over into single pure states, our principle would 
of course also agree with the assignment of zero entropy to those 
crystalline forms. Such an alternative statement of the principle, if
valid, could in any case, however, be of only hypothetical interest,
when we think of the necessities for isotopes to diffuse out into separate 
crystals, allotropic modifications to change into the most stable form, 
randomly oriented molecules to line themselves parallel, and all internal 
degeneracies of the molecules to become resolved, in order that such 
hypothetical crystalline forms should be realized. Hence we shall do 
well to refrain from any formulation of the principle other than that 
of assigning zero entropy to any system in a pure state, as has been 
justified by the considerations of statistical mechanics. 

140. Equilibrium between connected systems 

(a) Thermodynamic relations. Up to this point the thermodynamic
formalism considered has been specially adapted to the treatment of 
equilibrium in so-called ‘closed’ systems, having a composition which is 
assumed not to be altered by transfer of matter to or from the outside. 
Such a formalism is perhaps sufficient for the treatment of any kind of 
problem that could arise, since it would always be possible to regard 
the boundary separating a system of interest from its surroundings
as taken large enough to include any regions to or from which transfer
might occur. Nevertheless, when we are specially interested in the 
principles governing equilibrium with respect to the transfer of matter, 
it proves convenient to make use of a thermodynamic formalism
specially adapted to the treatment of so-called ‘open’ systems, where
explicit recognition is given to the possibility of changing the composition 
of a system of interest by the introduction or withdrawal of matter. 

To complete our account of the applications of statistical mechanics 
to thermodynamics, we must now give brief consideration to the 
statistical-mechanical explanation of the principles governing equili- 
brium, as to the transfer of matter as well as energy, when a system 
of interest is placed in contact with some other system to or from which
substances of one or more kinds might be transferred. This will then 
be found to give us a statistical-mechanical justification for the thermo- 
dynamic formalism ordinarily used in treating ‘open’ systems, and will 
also be found to give us a justification for the thermodynamic principle 
that entropy is a quantity having extensive magnitude, so that the 
entropy of a homogeneous substance is to be taken proportional to the 
amount of the substance considered. 

It will prove advantageous if we begin by considering the thermo- 
dynamic formalism appropriate for the treatment of ‘open’ systems. 
This formalism was first obtained by Gibbs† in his memoir ‘On the
Equilibrium of Heterogeneous Substances’, where special attention was 
given to the circumstance that the different systems or regions between 
which equilibrium is considered might themselves be different homo- 
geneous phases of the same substance or substances. 

Let us consider a system composed of $h$ independent kinds of sub-
stances or components, out of which any further substances present can
be regarded as formed, by chemical reaction if necessary, and let us
take the condition of the system at equilibrium as specified by its
energy $E$, by the values $a_1, a_2, \ldots$ of any external coordinates such
as volume which have to be considered, and by the number of mols
$N_1, N_2, \ldots, N_h$ for the $h$ different components which the system contains.
Treating these quantities as independent variables, we can then take
the change $\delta S$ in the entropy of the system, when we make a small
change from a given equilibrium condition to a neighbouring one, as
given in accordance with the principles of the calculus by

$$\delta S = \frac{\partial S}{\partial E}\,\delta E + \frac{\partial S}{\partial a_1}\,\delta a_1 + \ldots + \frac{\partial S}{\partial N_1}\,\delta N_1 + \ldots \tag{140.1}$$


† Gibbs, Trans. Conn. Acad. III (1876-78). 



§140 


THERMODYNAMICS OF OPEN SYSTEMS 


615 


And, in accordance with the relation of change in entropy to the heat 
reversibly absorbed by a system of constant composition, we can re- 
write this in the form 


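$$\delta S = \frac{1}{T}\,\delta E + \frac{A_1}{T}\,\delta a_1 + \frac{A_2}{T}\,\delta a_2 + \cdots + \frac{\partial S}{\partial N_1}\,\delta N_1 + \cdots + \frac{\partial S}{\partial N_h}\,\delta N_h, \tag{140.2}$$

where $T$ is the temperature of the system and $A_1, A_2, \ldots$ are the 
generalized external forces corresponding to the coordinates 
$a_1, a_2, \ldots$. In the common case, where the only external coordinate 
that has to be considered is the volume of the system $v$, and the 
corresponding external force is the pressure $p$, this expression can be 
rewritten in the less general form 

$$\delta S = \frac{1}{T}\,\delta E + \frac{p}{T}\,\delta v + \frac{\partial S}{\partial N_1}\,\delta N_1 + \cdots + \frac{\partial S}{\partial N_h}\,\delta N_h. \tag{140.3}$$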


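With the help of the foregoing it is now easy to investigate the 
conditions for equilibrium when systems of the above kind are connected 
with each other so as to form a combined system. Designating quantities 
applying to the individual systems forming the combination by 
different numbers of accents, and for simplicity taking a case where 
volume is the only external coordinate involved, we can evidently write 

$$\begin{aligned}
\delta S = {} & \frac{1}{T'}\,\delta E' + \frac{p'}{T'}\,\delta v' + \frac{\partial S'}{\partial N_1'}\,\delta N_1' + \cdots + \frac{\partial S'}{\partial N_h'}\,\delta N_h' \\
& + \frac{1}{T''}\,\delta E'' + \frac{p''}{T''}\,\delta v'' + \frac{\partial S''}{\partial N_1''}\,\delta N_1'' + \cdots + \frac{\partial S''}{\partial N_h''}\,\delta N_h'' \\
& + \text{etc.}
\end{aligned} \tag{140.4}$$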


as an expression for the dependence of the entropy $S$ of the whole 
combination on the variables characterizing its parts. In accordance 
with the second law of thermodynamics, however, if the energy and 
volume of the whole combination are held constant, the condition of 
equilibrium would be characterized by an adjustment of the entropy 
of the combination to its maximum possible value. This adjustment 
would of course have to be carried out with constant values for the 
total amounts of the various constituents present. Hence the condition 
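of equilibrium for the combined system can now be obtained by setting 
the above expression for $\delta S$ equal to zero, under the subsidiary conditions 

$$\begin{aligned}
\delta E' + \delta E'' + \delta E''' + \cdots &= 0 \\
\delta v' + \delta v'' + \delta v''' + \cdots &= 0 \\
\delta N_1' + \delta N_1'' + \delta N_1''' + \cdots &= 0 \\
&\;\;\vdots \\
\delta N_h' + \delta N_h'' + \delta N_h''' + \cdots &= 0.
\end{aligned} \tag{140.5}$$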



616 FURTHER APPLICATIONS TO THERMODYNAMICS Chap. XIV 


This is then immediately seen to lead to 


$$T' = T'' = T''' = \cdots, \tag{140.6}$$

$$p' = p'' = p''' = \cdots, \tag{140.7}$$

$$\begin{aligned}
\frac{\partial S'}{\partial N_1'} &= \frac{\partial S''}{\partial N_1''} = \frac{\partial S'''}{\partial N_1'''} = \cdots \\
&\;\;\vdots \\
\frac{\partial S'}{\partial N_h'} &= \frac{\partial S''}{\partial N_h''} = \frac{\partial S'''}{\partial N_h'''} = \cdots
\end{aligned} \tag{140.8}$$

as an explicit description of the condition for equilibrium between the 
separate systems which are in connexion. The first of these equations 
(140.6) expresses the condition for thermal equilibrium, the second 
equation (140.7) expresses the condition for mechanical equilibrium, 
and the remaining equations (140.8) express the conditions for no 
transfer of the h different component substances between the individual 
systems which form the combination. 

Similar treatment can of course also be given, when coordinates other 
than volume are involved, by giving appropriate consideration in each 
particular case to the nature of these coordinates and of the 
corresponding forces. This will then give us the conditions for mechanical 
equilibrium in a form appropriate to the particular case, and will leave 
the conditions for thermal equilibrium and for equilibrium with respect 
to the transfer of matter unaltered. 
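
As a rough numerical illustration of the conditions (140.6) to (140.8), and not as part of the original argument, we may take two portions of a monatomic perfect gas whose entropy is assumed to have the familiar form $S = N[\log(v/N) + \tfrac{3}{2}\log(E/N) + c]$ in units with $k = 1$. The constant `c`, the chosen totals, and all function names below are assumptions made only for the purpose of the sketch.

```python
import math

def entropy(E, V, N, c=2.5):
    """Assumed entropy of a monatomic perfect gas, units with k = 1 (c is arbitrary)."""
    return N * (math.log(V / N) + 1.5 * math.log(E / N) + c)

def partials(E, V, N, h=1e-6):
    """Central-difference estimates of dS/dE, dS/dV and dS/dN."""
    dS_dE = (entropy(E + h, V, N) - entropy(E - h, V, N)) / (2 * h)
    dS_dV = (entropy(E, V + h, N) - entropy(E, V - h, N)) / (2 * h)
    dS_dN = (entropy(E, V, N + h) - entropy(E, V, N - h)) / (2 * h)
    return dS_dE, dS_dV, dS_dN

# Fixed totals for the combination, as required by the conditions (140.5).
E_TOT, V_TOT, N_TOT = 30.0, 12.0, 9.0

def total_entropy(E1, V1, N1):
    """Entropy of the combination when the first part holds (E1, V1, N1)."""
    return entropy(E1, V1, N1) + entropy(E_TOT - E1, V_TOT - V1, N_TOT - N1)

# Candidate equilibrium: the first part holds one third of everything.
E1, V1, N1 = E_TOT / 3, V_TOT / 3, N_TOT / 3
first = partials(E1, V1, N1)
second = partials(E_TOT - E1, V_TOT - V1, N_TOT - N1)
s_max = total_entropy(E1, V1, N1)

# Trial transfers of energy, volume, or matter away from this division.
shifts = [(0.5, 0.0, 0.0), (0.0, 0.5, 0.0), (0.0, 0.0, 0.5), (-0.4, 0.3, -0.2)]
lowered = [total_entropy(E1 + dE, V1 + dV, N1 + dN) < s_max for dE, dV, dN in shifts]

print("1/T, p/T, dS/dN in first part: ", first)
print("1/T, p/T, dS/dN in second part:", second)
print("every trial transfer lowers the total entropy:", all(lowered))
```

The three derivatives agree between the two parts, and each of the trial transfers, which respect the subsidiary conditions by construction, reduces the entropy of the combination.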

In what follows, we shall be specially interested in the conditions 
for equilibrium with respect to the transfer of component substances 
between the different individual systems. For any given component i 
the equilibrium condition for no transfer can be expressed by the 
statement that the partial derivative of the entropy with respect to the 
amount of that substance shall have the same value 

$$\left(\frac{\partial S}{\partial N_i}\right)_{E,\,a_1,\,a_2,\,\ldots,\,N_j} = \text{const.} \tag{140.9}$$


for each of the individual systems in the combination. In accordance 
with our original introduction of such quantities in (140.1), it will be 
seen that these derivatives are to be taken holding the energy, external 
coordinates, and amounts of other components constant. This has been 
indicated in the above expression by the addition of subscripts, where 
the symbol $N_j$ is to be taken as applying to all components other than 
the component i. From the tendency for entropy to increase towards 
a maximum, it will be seen in the absence of equilibrium that there 
will be a tendency for the substance i to be transferred from those 
individual systems in which the value of $\partial S/\partial N_i$ is low to those in 
which it is high. 

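For comparison with the actual symbolism employed by Gibbs, and 
for convenience with respect to our later statistical-mechanical 
considerations, it will be advantageous also to express the foregoing in 
a somewhat different form. By solving our fundamental equation 
(140.2) for the variation in energy $\delta E$ instead of for the variation in 
entropy $\delta S$, we can write 

$$\delta E = T\,\delta S - (A_1\,\delta a_1 + A_2\,\delta a_2 + \cdots) - T\frac{\partial S}{\partial N_1}\,\delta N_1 - \cdots - T\frac{\partial S}{\partial N_h}\,\delta N_h, \tag{140.10}$$

and this can be expressed in the simpler form 

$$\delta E = T\,\delta S - (A_1\,\delta a_1 + A_2\,\delta a_2 + \cdots) + \mu_1\,\delta N_1 + \cdots + \mu_h\,\delta N_h, \tag{140.11}$$

provided we define the quantity $\mu_i$ for any component i by 

$$\mu_i = \left(\frac{\partial E}{\partial N_i}\right)_{S,\,a_1,\,a_2,\,\ldots,\,N_j} = -T\left(\frac{\partial S}{\partial N_i}\right)_{E,\,a_1,\,a_2,\,\ldots,\,N_j}. \tag{140.12}$$

The above quantities $\mu_i$ were called by Gibbs the potentials of the 
different component substances constituting a system. It will be seen 
from the foregoing that equilibrium with respect to the transfer of any 
component i from one system to another can also be expressed by 
requiring the same value 

$$\mu_i = \text{const.} \tag{140.13}$$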


for the potential of that component in each of the individual systems. 
It will also be noted in the absence of equilibrium that there would be 
a tendency for substances to be transferred from systems where their 
potential is high to those where it is low. 

The relation of these potentials to partial derivatives of the energy 
or of the entropy of a system is given by the definition (140.12). It 
will also be convenient to have their relation to partial derivatives of 
the Helmholtz free energy of the system 

$$A = E - TS. \tag{140.14}$$

Varying the quantities in this equation, we have 

$$\delta A = \delta E - T\,\delta S - S\,\delta T, \tag{140.15}$$

which by combining with (140.11) gives us 

$$\delta A = -S\,\delta T - (A_1\,\delta a_1 + A_2\,\delta a_2 + \cdots) + \mu_1\,\delta N_1 + \cdots + \mu_h\,\delta N_h. \tag{140.16}$$

We hence see that the potential for any component i can also be 
expressed by 

$$\mu_i = \left(\frac{\partial A}{\partial N_i}\right)_{T,\,a_1,\,a_2,\,\ldots,\,N_j} \tag{140.17}$$

as the partial derivative of free energy with respect to the number of 
mols of the component i, holding temperature, external coordinates, 
and the other components constant. 
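
The equivalence of the three expressions for the potential, as a derivative of the energy, of the entropy, and of the free energy, can be checked numerically in a simple case. The sketch below is only an illustration under an assumed model: a monatomic perfect gas with $S = N[\log(v/N) + \tfrac{3}{2}\log(E/N) + C]$ in units with $k = 1$, where the constant `C` and the function names are hypothetical choices and not part of the text.

```python
import math

C = 2.5  # arbitrary additive constant in the assumed entropy function

def entropy(E, V, N):
    """S(E, v, N) for the assumed monatomic perfect gas, units with k = 1."""
    return N * (math.log(V / N) + 1.5 * math.log(E / N) + C)

def energy(S, V, N):
    """E(S, v, N), obtained by solving the entropy function for E."""
    return N * math.exp((2.0 / 3.0) * (S / N - math.log(V / N) - C))

def helmholtz(T, V, N):
    """A(T, v, N) = E - T S, using E = (3/2) N T for this gas."""
    E = 1.5 * N * T
    return E - T * entropy(E, V, N)

def d_dN(func, x, V, N, h=1e-5):
    """Central difference with respect to N, the first two arguments held fixed."""
    return (func(x, V, N + h) - func(x, V, N - h)) / (2 * h)

E0, V0, N0 = 10.0, 4.0, 3.0
S0 = entropy(E0, V0, N0)
T0 = E0 / (1.5 * N0)

mu_from_energy = d_dN(energy, S0, V0, N0)          # (dE/dN) at constant S, v
mu_from_entropy = -T0 * d_dN(entropy, E0, V0, N0)  # -T (dS/dN) at constant E, v
mu_from_free_energy = d_dN(helmholtz, T0, V0, N0)  # (dA/dN) at constant T, v

print(mu_from_energy, mu_from_entropy, mu_from_free_energy)
```

The three printed values agree to within the accuracy of the finite differences, in accordance with (140.12) and (140.17).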

This form of expression for the potentials, together with the familiar 
principle that the change in free energy accompanying a change in state 
at a given temperature is equal to the isothermal reversible work 
necessary to bring about the change, is convenient in drawing conclusions 
as to the factors controlling the values of the $\mu_i$ in specific cases. In 
accordance therewith, if we start out with a system consisting of the 
actual system of interest in which the value of $\mu_i$ is desired, together 
with some of the pure substance i at the same temperature $T$ as that 
of the system of interest but in its ‘standard’ state of zero free energy, 
the potential $\mu_i$ of this substance in the system of interest will be equal 
to the isothermal reversible work per mol necessary to transfer a small 
portion of the substance from its ‘standard’ state into the system of 
interest, with the temperature and external coordinates for the latter 
held fixed. 

The above method of defining the potentials fii for the components 
of a system makes it possible to draw an important qualitative con- 
clusion, which will be of use in our later statistical-mechanical con- 
siderations, in § 141 (e), when we come to investigate the fluctuations 
in composition that would be expected in the grand canonical ensemble 
that we shall introduce for the representation of open thermodynamic 
systems. If we consider a system maintained at constant temperature 
and volume, and provided with a semipermeable membrane through 
which any chosen component i could be reversibly introduced, say, in 
gaseous form, it is evident from our general knowledge of the factors 
controlling the ‘escaping tendency’ of substances from a system that 
we can usually expect the work per mol necessary to transfer the 
component i from its standard state into the system (and hence also the 
corresponding potential $\mu_i$) to increase markedly as we increase the 
amount of that component already present in the system, and to be 
much less affected at constant volume by similar changes that might 
be made in the amounts of the other components in the system. As 
a consequence we shall regard it as plausible to assume in typical 
cases that $\delta\mu_i/\delta N_i$ would be a positive quantity greater than zero if 
we change the amount of any component i by some given small 
increment $\delta N_i$ and then adjust the amounts of the other components j of 
the system so as to leave their potentials $\mu_j$ unaltered, temperature 
and volume being held constant throughout.† Only under special 
circumstances (e.g. with the component i present in coexisting phases) 
could we expect the above quantity to approach zero. 

It will be of interest to consider a simple quantitative example of 
the above qualitative conclusion. For this purpose let us take a mixture 
of perfect gases at temperature $T$ and volume $v$, and fix our attention 
on one of the gases i in this mixture. Let $p_i^0$ be the pressure of this 
component in its standard state at that temperature, and $p_i$ its partial 
pressure in the mixture. In accordance with the work necessary for 
reversible transfer from the standard state into the mixture, we can 
then evidently write for the potential in question 

$$\mu_i = RT\log\frac{p_i}{p_i^0} + RT, \tag{140.18}$$

where the first term is the reversible work per mol necessary to raise 
the pressure of the component from $p_i^0$ to $p_i$ and the second term is the 
work per mol necessary for its reversible introduction through a 
semipermeable membrane into the mixture. Re-expressing the partial 
pressure $p_i$, with the help of the gas laws, in terms of the volume $v$ of the 
mixture and the number of mols $N_i$ of the component under 
consideration, we can rewrite this in the form 

$$\mu_i = RT\log\frac{N_i RT}{p_i^0\,v} + RT. \tag{140.19}$$

Hence, in this special case, we see that the potential $\mu_i$ of each 
component i would depend only on the number of mols of that component 
present, and can write the simple result 

$$\frac{\delta\mu_i}{\delta N_i} = \frac{RT}{N_i} \tag{140.20}$$

for the change in any particular potential $\mu_i$ with the corresponding 
number of mols $N_i$, the potentials $\mu_j$ of remaining components being 
kept constant. 
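
The passage from (140.19) to (140.20) may be checked by a short numerical differentiation. The following is a minimal sketch; the temperature, volume, and standard pressure are arbitrary illustrative values, and the function name `potential` is a hypothetical label for the right-hand side of (140.19).

```python
import math

R = 8.314462618  # gas constant in J/(mol K)

def potential(N_i, T, v, p0):
    """Right-hand side of (140.19) for one component of a perfect-gas mixture."""
    return R * T * math.log(N_i * R * T / (p0 * v)) + R * T

# Illustrative conditions (assumed values, not taken from the text).
T, v, p0 = 300.0, 0.025, 101325.0

results = []
for N_i in (0.5, 1.0, 2.0):
    h = 1e-6
    # Other potentials stay fixed automatically here, since each mu_j
    # depends only on its own N_j in a perfect-gas mixture.
    slope = (potential(N_i + h, T, v, p0) - potential(N_i - h, T, v, p0)) / (2 * h)
    results.append((N_i, slope, R * T / N_i))

for N_i, slope, expected in results:
    print(N_i, slope, expected)
```

The numerical slope is positive and agrees with $RT/N_i$ for each amount tried, falling off as the component becomes more plentiful.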

† This conclusion as to the dependence of potentials on composition at constant 
temperature and volume is not to be confused with the so-called Gibbs-Duhem relation 
holding at constant temperature and pressure, which can be expressed by the equation 
$N_1\,\delta\mu_1 + N_2\,\delta\mu_2 + \cdots + N_h\,\delta\mu_h = 0$. 

(b) The grand canonical ensemble. We are now ready to consider the 
statistical-mechanical apparatus that will be needed in explaining the 
foregoing thermodynamic method of treating equilibrium with respect 
to the transfer of matter to or from a system of interest. Since we now 
contemplate the possibility of changes in the composition of a system 
of interest by amounts which will not be exactly known, it now proves 
desirable to introduce the idea of representative ensembles composed 
of members which can differ not only in state but also in the amounts 
of material of various kinds which they contain. Using the language of 
Gibbs,† such ensembles may be given the name of grand ensembles and 
the name of petit ensembles may be used to designate our previous kind 
of ensemble where all the member systems are composed of the same 
numbers of molecules of the different kinds needed for the construction 
of the system. 

Since we shall be interested in equilibria, we may proceed at once 
to the definition of the grand canonical ensemble which provides the 
appropriate apparatus for treating equilibrium not only towards the 
transfer of energy but also towards the transfer of matter when systems 
are placed in contact. Let us consider a system composed of h 
independent kinds of substances or components, out of which any further 
substances in the system can be regarded as composed. Let us use the 
symbols $n_1, n_2, \ldots, n_h$ to designate the numbers of molecules of these 
components in any member of the corresponding representative 
ensemble. And let us use the symbol $E_m$ to designate the energy of such 
a member in any energy characteristic state m, where it will be noted 
that the spectrum of such values will depend on the composition of the 
particular member of the ensemble under consideration. We may then 
define the grand canonical ensemble corresponding to such a system by 
the formula 

$$P_{n_1 \ldots n_h,\,m} = e^{(\Omega + \mu_1 n_1 + \cdots + \mu_h n_h - E_m)/\theta}, \tag{140.21}$$

where $P_{n_1 \ldots n_h,\,m}$ is the probability for finding a member of the ensemble 
with the composition described by the numbers of molecules $n_1 \ldots n_h$ 
and in an energy state m which is possible for this composition, and 
where $\Omega$, $\theta$, and $\mu_1 \ldots \mu_h$ are adjustable parameters. 

We regard the above distribution as applying to all possible values, 
from zero to infinity, for the numbers of molecules $n_1 \ldots n_h$ and to all 
possible values for the energy $E_m$. We shall, however, still regard the 
external coordinates $a_1$, $a_2$, $a_3$, etc., such, for example, as volume, for 
the different members of the ensemble as all having the same values as 
pertain in the system of interest itself. In typical cases there will then 
be a high concentration of probability in the neighbourhood of the 
mean composition and mean energy for the members of the ensemble, 
as we shall see later, § 141 (e). 

† Gibbs, Elementary Principles in Statistical Mechanics, Yale University Press, 1902, 
see chapter xv. The treatment of grand ensembles given in the present section is a 
condensed quantum mechanical transcription of the classical treatment given by Gibbs 
rather than a complete and detailed development. 

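Important properties of the grand canonical distribution are 
expressed by the following equations. Since the total probability for 
finding one or another number of molecules and one or another value 
of energy will be normalized to unity, we have 

$$\sum_{n_1 \ldots n_h,\,m} e^{(\Omega + \mu_1 n_1 + \cdots + \mu_h n_h - E_m)/\theta} = 1, \tag{140.22}$$

when summed over all compositions $n_1 \ldots n_h$ and over all corresponding 
states m. For the mean energy of the members of the ensemble, we have 

$$\bar{E} = \sum_{n_1 \ldots n_h,\,m} E_m\, e^{(\Omega + \mu_1 n_1 + \cdots + \mu_h n_h - E_m)/\theta}. \tag{140.23}$$

For the mean numbers of molecules for the different kinds of components, 
we have 

$$\begin{aligned}
\bar{n}_1 &= \sum_{n_1 \ldots n_h,\,m} n_1\, e^{(\Omega + \mu_1 n_1 + \cdots + \mu_h n_h - E_m)/\theta} \\
&\;\;\vdots \\
\bar{n}_h &= \sum_{n_1 \ldots n_h,\,m} n_h\, e^{(\Omega + \mu_1 n_1 + \cdots + \mu_h n_h - E_m)/\theta}.
\end{aligned} \tag{140.24}$$

And for the value of the quantity $\bar{\bar{H}}$, which we define in analogy to that 
for a petit ensemble, we have 

$$\bar{\bar{H}} = \sum_{n_1 \ldots n_h,\,m} \frac{\Omega + \mu_1 n_1 + \cdots + \mu_h n_h - E_m}{\theta}\, e^{(\Omega + \mu_1 n_1 + \cdots + \mu_h n_h - E_m)/\theta} = \frac{\Omega + \mu_1 \bar{n}_1 + \cdots + \mu_h \bar{n}_h - \bar{E}}{\theta}. \tag{140.25}$$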

It will be noted that the grand canonical ensemble can be regarded 
as consisting of a weighted collection of petit canonical ensembles, one 
for each possible composition. The grand ensemble will therefore itself 
be in a condition of equilibrium corresponding to that which could be 
regarded as reached as the ultimate result of long-continued essentially 
isolated behaviour. It will also be noted that the above expressions 
(140.22-4) provide in all h+2 equations which could be solved for the 
parameters $\Omega$, $\theta$, $\mu_1 \ldots \mu_h$ describing the distribution, in terms of the 
mean energy $\bar{E}$ and mean numbers of molecules $\bar{n}_1 \ldots \bar{n}_h$ of the members 
of the ensemble. By correlating these mean values with the definite 
values of energy and composition which would be assigned to a system 
of interest from the point of view of thermodynamics, it then becomes 
possible, as we shall see, to regard the ensemble as representing 
thermodynamic equilibrium for a system of specified energy and composition. 
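
The content of equations (140.21) to (140.25) may be made concrete with a small model that can be summed exactly. The sketch below is an illustration only, under the assumption of a hypothetical system of `M` sites, each either empty or occupied by one molecule of a single component with energy `eps`; the model, the parameter values, and the function name are not taken from the text.

```python
import math

def grand_canonical_lattice(M, mu, eps, theta):
    """Grand canonical averages for M sites, each empty or singly occupied."""
    compositions = range(M + 1)
    weight = [math.comb(M, n) for n in compositions]  # number of states m for each n
    energy = [n * eps for n in compositions]          # E_m, the same for all of them
    # Omega is fixed by the normalization condition (140.22).
    omega = -theta * math.log(sum(g * math.exp((mu * n - e) / theta)
                                  for g, n, e in zip(weight, compositions, energy)))
    # Probability of a single state, equation (140.21).
    p = [math.exp((omega + mu * n - e) / theta) for n, e in zip(compositions, energy)]
    total = sum(g * q for g, q in zip(weight, p))
    mean_E = sum(g * q * e for g, q, e in zip(weight, p, energy))        # (140.23)
    mean_n = sum(g * q * n for g, q, n in zip(weight, p, compositions))  # (140.24)
    H = sum(g * q * math.log(q) for g, q in zip(weight, p))              # sum of P log P
    return omega, total, mean_E, mean_n, H

M, mu, eps, theta = 20, -0.3, 0.2, 0.7
omega, total, mean_E, mean_n, H = grand_canonical_lattice(M, mu, eps, theta)
H_closed = (omega + mu * mean_n - mean_E) / theta  # right-hand side of (140.25)

print("total probability:", total)
print("mean energy, mean number:", mean_E, mean_n)
print("H summed directly and from (140.25):", H, H_closed)
```

The direct sum of $P \log P$ over all states and the closed expression on the right of (140.25) agree, and the mean number reproduces the value $Mz/(1+z)$ with $z = e^{(\mu - \epsilon)/\theta}$ expected for such independent sites.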

(c) Correlation of thermodynamic and statistical-mechanical 
quantities. Assuming this representation of thermodynamic equilibrium, 
the complete correlation of thermodynamic with statistical-mechanical 
quantities can now be obtained, as in our previous treatment of petit 
canonical ensembles, by considering the variation in $\bar{\bar{H}}$, when we pass 
from any given grand canonical distribution to a similar neighbouring 
distribution, corresponding to a change in the system of interest from 
one condition of thermodynamic equilibrium to a similar neighbouring 
condition, with slightly altered values for the energy, composition, and 
external coordinates pertaining thereto. Making use of the expression 
for $\bar{\bar{H}}$ given by (140.25), we immediately obtain for the variation in 
question 

$$\delta\bar{\bar{H}} = \frac{1}{\theta}\left(\delta\Omega + \mu_1\,\delta\bar{n}_1 + \bar{n}_1\,\delta\mu_1 + \cdots + \mu_h\,\delta\bar{n}_h + \bar{n}_h\,\delta\mu_h - \delta\bar{E}\right) - \frac{\Omega + \mu_1\bar{n}_1 + \cdots + \mu_h\bar{n}_h - \bar{E}}{\theta^2}\,\delta\theta. \tag{140.26}$$

We shall be interested, however, in putting this expression into a more 
significant form. 

Since we are making a change to a new grand canonical distribution, 
our variation must be made in such a way as to preserve the validity 
of equation (140.22) which makes the total probability of finding a 
member of the ensemble in some possible condition equal to unity. 
Hence the above variation must be such as to satisfy 



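$$\sum_{n_1 \ldots n_h,\,m} e^{(\Omega + \mu_1 n_1 + \cdots + \mu_h n_h - E_m)/\theta} \left[\frac{1}{\theta}\left(\delta\Omega + n_1\,\delta\mu_1 + \cdots + n_h\,\delta\mu_h - \frac{\partial E_m}{\partial a_1}\,\delta a_1 - \frac{\partial E_m}{\partial a_2}\,\delta a_2 - \cdots\right) - \frac{\Omega + \mu_1 n_1 + \cdots + \mu_h n_h - E_m}{\theta^2}\,\delta\theta\right] = 0, \tag{140.27}$$

where the change in the eigenvalues $E_m$ has been appropriately 
expressed in terms of their dependence on the external coordinates $a_1$, 
$a_2$, etc., such, for example, as volume. Defining the external forces $A_1$, 
$A_2$, etc., corresponding to these coordinates as in (121.6), and making 
use of the general method of computing mean values, the above 
equation can be rewritten in the form 

$$\frac{1}{\theta}\left(\delta\Omega + \bar{n}_1\,\delta\mu_1 + \cdots + \bar{n}_h\,\delta\mu_h + \bar{A}_1\,\delta a_1 + \bar{A}_2\,\delta a_2 + \cdots\right) - \frac{\Omega + \mu_1\bar{n}_1 + \cdots + \mu_h\bar{n}_h - \bar{E}}{\theta^2}\,\delta\theta = 0. \tag{140.28}$$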



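And combining with (140.26), this then gives us 

$$\delta\bar{\bar{H}} = -\frac{\delta\bar{E}}{\theta} - \frac{1}{\theta}\left(\bar{A}_1\,\delta a_1 + \bar{A}_2\,\delta a_2 + \cdots\right) + \frac{1}{\theta}\left(\mu_1\,\delta\bar{n}_1 + \mu_2\,\delta\bar{n}_2 + \cdots + \mu_h\,\delta\bar{n}_h\right). \tag{140.29}$$

For the purpose of making the desired correlations, this result may 
now be re-expressed in the more familiar form 

$$\delta\bar{E} = -\theta\,\delta\bar{\bar{H}} - (\bar{A}_1\,\delta a_1 + \bar{A}_2\,\delta a_2 + \cdots) + \mu_1\,\delta\bar{n}_1 + \cdots + \mu_h\,\delta\bar{n}_h \tag{140.30}$$

and compared with our previous fundamental thermodynamic relation 
(140.11), 

$$\delta E = T\,\delta S - (A_1\,\delta a_1 + A_2\,\delta a_2 + \cdots) + \mu_1\,\delta N_1 + \cdots + \mu_h\,\delta N_h. \tag{140.31}$$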

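In the light of our earlier correlations between thermodynamic and 
statistical-mechanical quantities under circumstances where changes in 
the composition of the system were not contemplated, we may now 
take the thermodynamic quantities (energy $E$, temperature $T$, entropy 
$S$, potential of the ith component $\mu_i$, number of mols of that 
component $N_i$, external force of the jth kind $A_j$, and corresponding external 
coordinate $a_j$) for a system in thermodynamic equilibrium of the 
so-called ‘open’ kind where changes in composition are contemplated 
as correlated with quantities pertaining to the corresponding grand 
canonical ensemble by the expressions 

$$E \leftrightarrow \bar{E}, \quad T \leftrightarrow \theta/k, \quad S \leftrightarrow -k\,\bar{\bar{H}}, \quad \mu_i \leftrightarrow N_A\,\mu_i, \quad N_i \leftrightarrow \bar{n}_i/N_A, \quad A_j \leftrightarrow \bar{A}_j, \quad a_j \leftrightarrow a_j, \tag{140.32}$$

with the thermodynamic quantity standing on the left and the ensemble 
quantity on the right in each case, where $k$ is Boltzmann’s constant and 
$N_A$ is Avogadro’s number. For future reference it will also be useful for 
us to note, in accordance with the above and with (140.25), that we can 
rewrite the correlation of entropy $S$ with statistical-mechanical 
quantities in the form 

$$S = -k\,\frac{\Omega + \mu_1\bar{n}_1 + \cdots + \mu_h\bar{n}_h - \bar{E}}{\theta}. \tag{140.33}$$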
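
The relation (140.30), on which the foregoing correlations rest, can be tested numerically in a case where no external coordinate is varied. The following sketch again assumes the hypothetical model of `M` independent sites, each empty or holding one molecule of energy `EPS`, for which the sums in (140.22) to (140.25) reduce to closed forms; the model and the step sizes are illustrative assumptions.

```python
import math

M, EPS = 20, 0.2  # number of sites and site energy of the hypothetical model

def averages(theta, mu):
    """Return (mean energy, mean number, H) for the grand canonical site model."""
    z = math.exp((mu - EPS) / theta)
    omega = -theta * M * math.log(1.0 + z)      # from the normalization (140.22)
    mean_n = M * z / (1.0 + z)                  # from (140.24)
    mean_E = EPS * mean_n                       # from (140.23)
    H = (omega + mu * mean_n - mean_E) / theta  # from (140.25)
    return mean_E, mean_n, H

theta0, mu0 = 0.7, -0.3
d_theta, d_mu = 1e-3, -2e-3  # a small change to a neighbouring distribution

E0, n0, H0 = averages(theta0, mu0)
E1, n1, H1 = averages(theta0 + d_theta, mu0 + d_mu)

# Coefficients are taken at the midpoint, so the residual is of third order.
theta_mid, mu_mid = theta0 + d_theta / 2, mu0 + d_mu / 2
lhs = E1 - E0
rhs = -theta_mid * (H1 - H0) + mu_mid * (n1 - n0)

print("change in mean energy:      ", lhs)
print("-theta dH + mu d(mean n):   ", rhs)
```

The two printed quantities agree to within the small residual expected of a finite step, which is the content of (140.30) when the $\delta a$ terms are absent.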

(d) Conditions for equilibrium when grand canonical ensembles are 
combined. We shall now illustrate the appropriate character of the 
new statistical-mechanical apparatus which has been introduced, by 
showing that the conditions for statistical equilibrium when two 
different grand canonical ensembles are placed in connexion with each 
other would give an explanation of the conditions for thermodynamic 
equilibrium when the corresponding systems are placed in connexion. 

For this purpose let us consider two grand canonical ensembles with 
the distributions 

$$P'_{n_1' \ldots n_h',\,m'} = e^{(\Omega' + \mu_1' n_1' + \cdots + \mu_h' n_h' - E'_{m'})/\theta'} \tag{140.34}$$

and 

$$P''_{n_1'' \ldots n_h'',\,m''} = e^{(\Omega'' + \mu_1'' n_1'' + \cdots + \mu_h'' n_h'' - E''_{m''})/\theta''}, \tag{140.35}$$


where we use accents to distinguish quantities pertaining respectively 
to the two distributions. We may regard these two ensembles as repre- 
senting two separate systems of interest each in a condition of thermo- 
dynamic equilibrium. Let us then consider a third ensemble, of higher 
order in the number of its members, which could be obtained by taking 
each member of the one ensemble in conjunction with each member of 
the other. We may regard the new ensemble as representing the two 
systems of interest taken in conjunction. 

Assuming for the moment that the members of the two original 
ensembles are placed in thermal contact but not in connexion with 
each other, let us first consider the conditions for thermal equilibrium. 
Each of the two original grand canonical ensembles is itself a collection 
of petit canonical ensembles, one for each possible composition, 
normalized to correspond to the representation of that composition in the 
whole ensemble, and distributed in a manner determined by the 
parameters $\theta'$ and $\theta''$ in the two cases respectively. We can hence apply 
our previous considerations, § 127, as to thermal flow when the members 
of canonical ensembles are placed in contact and conclude also for grand 
canonical ensembles that the condition for no thermal flow in the mean 
will be expressed by the equality 

$$\theta' = \theta''. \tag{140.36}$$

Assuming next that the members of the ensembles which have been 
combined are allowed to exert mechanical forces on each other, let us 
consider the conditions for mechanical equilibrium. As an example we 
may assume the introduction of a movable partition which would allow 
the combined members to exert pressure on each other. As an appro- 
priate expression for an average condition of mechanical equilibrium in 
the combined ensemble, it will then be reasonable to require an equality 

$$\bar{p}' = \bar{p}'' \tag{140.37}$$

between the mean values of the pressures in the members of the two 
ensembles, thus assuring that there would be no tendency in the mean 
for the movable partition to be displaced in either direction. 
Appropriate consideration could also be given to other kinds of external force, 
with the general result that we should require the same relation for 
the mean values of such forces in the ensemble as would be required 
in a thermodynamic treatment of equilibrium for the definite values 
which would then be assigned to those forces. 

Finally, considering that connexion could be established allowing the 
passage of matter between the members of the two ensembles, let us study 
the conditions for equilibrium towards the transfer of component substances. 
To investigate this, let us take the necessary conditions for thermal and 
mechanical equilibrium as already guaranteed, and, by combining the 
expressions for probability given by (140.34) and (140.35), write 

$$P_{n_1' \ldots n_h',\,m';\;n_1'' \ldots n_h'',\,m''} = e^{(\Omega' + \Omega'' + \mu_1' n_1' + \mu_1'' n_1'' + \cdots + \mu_h' n_h' + \mu_h'' n_h'' - E'_{m'} - E''_{m''})/\theta}, \tag{140.38}$$

where $\theta$ is the common value of that parameter in the two original 
distributions, as an expression describing the distribution in the 
combined ensemble when it is set up. Such a distribution would not in 
general correspond to equilibrium towards the transfer of matter. 
However, let us now write 

$$n_1 = n_1' + n_1'', \quad \ldots, \quad n_h = n_h' + n_h'' \tag{140.39}$$

as expressions for the total numbers of molecules for the h components 
in the members of the combined ensemble, put 

$$\Omega = \Omega' + \Omega'', \tag{140.40}$$

and take the possible energies for the interconnected system as 
approximately given by 

$$E_m = E'_{m'} + E''_{m''}, \tag{140.41}$$

as will be justified for most energy states in typical cases for systems 
of large volume. And, furthermore, let us require the equalities 

$$\mu_1' = \mu_1'' = \mu_1, \quad \ldots, \quad \mu_h' = \mu_h'' = \mu_h \tag{140.42}$$

as holding between the values of these parameters in the two original 
ensembles. Our expression (140.38) for the initial distribution would then 
lead, on the gradual establishment of interconnexion (see § 97 (b)), to 

$$P_{n_1 \ldots n_h,\,m} = e^{(\Omega + \mu_1 n_1 + \cdots + \mu_h n_h - E_m)/\theta} \tag{140.43}$$

as a close expression for the distribution in the resulting ensemble. 
This would be a grand canonical distribution corresponding to 
equilibrium at temperature $T = \theta/k$, and exhibiting no dependence of 
probabilities on the separate values of the $n_i'$ and $n_i''$. Hence we take the 
equalities (140.42) as the condition for equilibrium with respect to the 
transfer of matter. 
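
The way in which equal values of $\theta$ and of the $\mu_i$ turn the product of two grand canonical distributions into a single one, with $\Omega = \Omega' + \Omega''$, can be exhibited in a model that is exactly summable. The sketch below assumes two hypothetical systems of `M1` and `M2` independent sites (each site empty or holding one molecule of energy `eps`), for which the additivity of energies in (140.41) holds exactly; these choices are illustrative and not part of the text.

```python
import math

def omega(M, mu, eps, theta):
    """Omega from the normalization (140.22) for M two-state sites."""
    xi = sum(math.comb(M, n) * math.exp((mu - eps) * n / theta) for n in range(M + 1))
    return -theta * math.log(xi)

def prob_of_composition(M, n, mu, eps, theta):
    """Probability of finding n molecules: number of states times (140.21)."""
    return math.comb(M, n) * math.exp((omega(M, mu, eps, theta) + (mu - eps) * n) / theta)

M1, M2 = 6, 9
mu, eps, theta = -0.3, 0.2, 0.7  # common mu and theta, as in (140.36) and (140.42)

def combined(n):
    """Probability of total composition n in the product of the two ensembles."""
    return sum(prob_of_composition(M1, n1, mu, eps, theta)
               * prob_of_composition(M2, n - n1, mu, eps, theta)
               for n1 in range(max(0, n - M2), min(M1, n) + 1))

product_ensemble = [combined(n) for n in range(M1 + M2 + 1)]
single_ensemble = [prob_of_composition(M1 + M2, n, mu, eps, theta)
                   for n in range(M1 + M2 + 1)]

worst = max(abs(a - b) for a, b in zip(product_ensemble, single_ensemble))
omega_gap = abs(omega(M1, mu, eps, theta) + omega(M2, mu, eps, theta)
                - omega(M1 + M2, mu, eps, theta))

print("largest difference in probabilities:", worst)
print("Omega' + Omega'' - Omega:", omega_gap)
```

In this model the probability of each total composition depends only on the total $n$, not on how it is divided between the two parts, and the parameter $\Omega$ of the combined ensemble is the sum of those of its parts, as in (140.40) and (140.43).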

This then gives our statistical-mechanical explanation of the thermo- 
dynamic principles governing equilibrium when systems are placed 
in connexion, since the foregoing requirements (140.36), (140.37), and 
(140.42) for statistical equilibrium are seen to be natural analogues of 
our previous thermodynamic requirements (140.6), (140.7), and (140.13) 
for thermodynamic equilibrium which can be written in the form 

$$T' = T'', \quad p' = p'', \quad \text{and} \quad \mu_i' = \mu_i''. \tag{140.44}$$

To complete our justification for the use of the grand canonical 
ensemble to represent a thermodynamic system in a condition of 
equilibrium it would also be necessary to show in typical cases that the 
fluctuations of composition in the ensemble would be small. This we 
shall do in the next section in § 141 (e), when we are considering 
fluctuations in general. 

(e) Explanation of the Gibbs paradox. As a further example of the 
usefulness of the grand canonical ensemble, we may consider the ex- 
planation which it provides for the so-called Gibbs paradox as to the 
entropy of combined systems. When two similar systems are placed 
in connexion with each other, for example when two samples of the 
same pure gas at a given temperature and pressure are allowed to 
mingle by the removal of a separating partition, the entropy of the 
resulting system is taken as equal to the sum of the entropies of the 
two original parts. This valid thermodynamic procedure, however, has 
sometimes seemed paradoxical from the point of view of molecular 
mechanics, since the entropy would have been regarded as subject to 
increase by diffusion if the two separate gases had been composed of 
different kinds of molecules, and it has seemed strange that this increase 
in entropy should be abolished merely because the two kinds of mole- 
cules that diffuse into each other happen to be of the same kind. 

The complete statistical-mechanical explanation of the thermo- 
dynamic principle, that the entropy of a homogeneous substance is 
proportional to the amount taken, depends on the recognition of two 
factors. The first of these factors is the consideration that individual 
molecules of the same substance when allowed free motion inside a 
common container are not to be treated as distinguishable one from 
another in the calculation of entropy. The second of the factors is the 





consideration that the grand rather than the petit canonical ensemble 
must be taken as providing the appropriate representation for a system 
in thermodynamic equilibrium when changes in its composition are 
contemplated. 

In the classical development of statistical mechanics by Gibbs, the 
first of the above factors was allowed for by the introduction of a special 
step in which attention was turned away from the specific phases of 
a system, which depend on just which particular molecules are assigned 
to different regions in the μ-space, to its generic phases, which depend 
only on the numbers of molecules so assigned. In the present quantum 
mechanical development of statistical mechanics no special step of this 
kind has to be introduced since the quantum mechanics per se already 
allows for the consideration that individual molecules of the same 
substance are not in general distinguishable one from another.† 

† An illuminating discussion of the close relation between the introduction of generic 
phases by Gibbs and the present procedures adopted in the quantum mechanics is given 
by Epstein in his article, ‘Gibbs’ Methods in Quantum Statistics’, published in 
Commentary on the Scientific Writings of J. Willard Gibbs, vol. ii, Yale University Press, 1936. 

The allowance for this first factor is alone sufficient to lead to a 
calculated entropy for a homogeneous substance which is nearly but not 
quite proportional to the amount taken. To illustrate this let us 
consider the correlation of entropy with statistical-mechanical quantities, 

$$S = -k\,\bar{\bar{H}}_{\text{petit}}, \tag{140.45}$$

which we have made at the stage of our development when allowance 
is made for the indistinguishability of like molecules but when we are 
treating ‘closed’ rather than ‘open’ systems and are not considering 
the possibility of change in composition. In accordance with our 
representation of a ‘closed’ system at thermodynamic equilibrium by 
a canonical ensemble, this then gives us, in agreement with (120.6) 
and (120.6), 

$$S = \frac{\bar{E}}{T} + k\log\sum_m e^{-E_m/kT} \tag{140.46}$$

for the entropy of a system at equilibrium, where $\bar{E}$ is the mean energy 
to be expected for the system and the summation is over all possible 
energy states m. In the case of a homogeneous substance it is 
immediately evident that the first of the two terms in this expression is 
indeed proportional to the amount of the substance taken. With freely 
moving molecules, however, the second term in (140.46) is not quite 
proportional to that amount. For example, in the case of a perfect 
monatomic gas, consisting of n atoms of mass m at temperature $T$ and 
in volume $v$, we have, in agreement with (136.6), 

$$k\log\sum_m e^{-E_m/kT} = nk\log\left[\frac{e\,v}{n\,h^3}\,(2\pi mkT)^{3/2}\right] - k\log\sqrt{2\pi n}. \tag{140.47}$$
This is then seen to be proportional to the amount of gas taken except 
for a term which becomes negligible as we take large enough amounts 
of the substance. It may be remarked in a qualitative way that the 
failure of this procedure to give strict proportionality is due to a failure 
to take proper account of the possibility for fluctuations in the amount 
of gas that could be present in the two parts of the whole container when 
two original samples of the gas are connected. It may also be remarked, 
however, that the deviation from strict proportionality is too small 
in typical cases to make any correction for it necessary in practical 
computations. 
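
The size of this deviation can be estimated directly. The sketch below assumes the standard sum-over-states of a perfect monatomic gas of n indistinguishable atoms, $\log \sum_m e^{-E_m/kT} = n\log[v(2\pi mkT)^{3/2}/h^3] - \log n!$, of which (140.47) is the Stirling form; the abbreviation `cv` for $v(2\pi mkT)^{3/2}/h^3$ and the dilution `cv_per_atom` are illustrative assumptions.

```python
import math

def log_Z(n, cv):
    """log of the canonical sum-over-states; cv = v (2 pi m k T)^(3/2) / h^3."""
    return n * math.log(cv) - math.lgamma(n + 1)

def excess(n, cv_per_atom=1.0e5):
    """log Z of the joined gas minus that of its two separate halves (units of k)."""
    cv = cv_per_atom * n
    return log_Z(2 * n, 2 * cv) - 2 * log_Z(n, cv)

rows = []
for n in (10**2, 10**4, 10**6):
    joined = log_Z(2 * n, 2 * 1.0e5 * n)
    rows.append((n, excess(n), 0.5 * math.log(math.pi * n), excess(n) / joined))

for n, e, estimate, fraction in rows:
    print(n, e, estimate, fraction)
```

On joining two equal samples, the petit canonical entropy exceeds the sum of the parts by about $\tfrac{1}{2}k\log(\pi n)$, which follows from the last term of (140.47). This excess grows only logarithmically with n, so that its ratio to the proportional part falls towards zero for large amounts of gas.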

Finally, we may now turn to the completely appropriate apparatus 
for the treatment of ‘open’ thermodynamic systems, which makes due 
allowance both for the indistinguishability of like molecules and for the 
possibility of changes in composition. We then correlate entropy with 
statistical-mechanical quantities by 


$$S = -k\,\bar{\bar{H}}_{\text{grand}}. \tag{140.48}$$

And, in accordance with our use of the grand canonical ensemble for 
the representation of equilibrium, we then obtain, in agreement with 
(140.33), 

$$S = \frac{\bar{E}}{T} - \frac{\Omega}{T} - \frac{\mu_1\bar{n}_1 + \cdots + \mu_h\bar{n}_h}{T} \tag{140.49}$$

for the entropy of a homogeneous substance, where $\bar{E}$ is the mean 
energy to be expected for the substance, and the mean composition to 
be expected for the system is specified by the mean numbers of 
molecules $\bar{n}_1 \ldots \bar{n}_h$ for the different components. This expression then indeed 
makes the entropy strictly proportional to the amount of substance 
taken, since we see from the conditions for equilibrium, when portions 
of the substance are combined, that the quantities $\mu_1 \ldots \mu_h$ would be 
independent of the amount of substance in agreement with (140.42), 
and that the quantity $\Omega$ would be proportional to the amount taken 
in agreement with (140.40). We thus obtain a satisfactory statistical- 
mechanical correlate for the proportionality of the entropy of a homo- 
geneous substance to its amount. 
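
The strict proportionality given by (140.49) may be verified numerically for a perfect monatomic gas of one component. The sketch assumes, as an illustration only, that each petit ensemble of n atoms contributes $[v(2\pi mkT)^{3/2}/h^3]^n/n!$ to the sum in (140.22) and has mean energy $\tfrac{3}{2}n\theta$; the abbreviation `cv`, the cut-off `n_max`, and the parameter values are assumptions of the sketch.

```python
import math

def grand_entropy(cv, mu_over_theta, n_max=4000):
    """S/k and mean number for a perfect monatomic gas in the grand ensemble.

    cv stands for v (2 pi m k T)^(3/2) / h^3, which is proportional to the volume.
    """
    x = cv * math.exp(mu_over_theta)
    # Logarithm of each term x^n / n! of the sum that fixes Omega through (140.22).
    terms = [n * math.log(x) - math.lgamma(n + 1) for n in range(n_max + 1)]
    top = max(terms)
    log_sum = top + math.log(sum(math.exp(t - top) for t in terms))
    omega_over_theta = -log_sum
    mean_n = sum(n * math.exp(t - log_sum) for n, t in enumerate(terms))
    mean_E_over_theta = 1.5 * mean_n  # mean energy 3 n theta / 2 for n atoms
    # Equation (140.49) divided by k, with T = theta / k.
    S = mean_E_over_theta - omega_over_theta - mu_over_theta * mean_n
    return S, mean_n

S1, n1 = grand_entropy(5000.0, -4.0)   # a given volume
S2, n2 = grand_entropy(10000.0, -4.0)  # twice the volume, same mu and theta

print("mean numbers:", n1, n2)
print("entropies in units of k:", S1, S2, "ratio:", S2 / S1)
```

With $\mu$ and $\theta$ held the same, doubling the volume doubles both the mean number of atoms and the entropy to within rounding error, with no logarithmic remainder of the kind found for the petit ensemble.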

Except for our later consideration of fluctuations in a grand canonical 
ensemble in § 141 (e) of the next section, this must now complete our 
partial treatment of grand ensembles and their use in the 
representation of ‘open’ systems. Once more we emphasize the fundamentality 
of the classical considerations of Gibbs and their ready applicability 
for the development of a quantum statistical mechanics. 

141. Fluctuations at thermodynamic equilibrium 

The general explanation of the principles of thermodynamics in terms 
of statistical mechanics, which was given in the preceding chapter, and 
the further applications which have been undertaken in the foregoing 
parts of the present chapter, have been based on the idea that the 
thermodynamic behaviour of a system of interest can be correlated 
with the mean behaviour of the similar systems composing an appro- 
priate representative ensemble, and that the values of thermodynamic 
quantities applying to the system of interest can be correlated with the 
mean values of suitable quantities for the members of the ensemble. 
Thus, for example, the thermodynamic quantities for a ‘closed’ system
in thermodynamic equilibrium, which were selected for special study
in the foregoing, the pressure $p$, energy $E$, heat capacity $C_v$, and entropy
$S$, can be regarded from the statistical-mechanical point of view as the
mean values of the analogous mechanical quantities in the corresponding
canonical ensemble, namely $p = -\overline{\partial E/\partial v}$, $E = \bar{E}$, $C_v = k\,\partial\bar{E}/\partial\theta$, and
$S = -k\,\bar{\bar{H}}$ respectively. Hence the actual precision of the principles of
thermodynamics when applied under typical circumstances must depend on
the small fluctuations around the mean which would be exhibited under 
those conditions by the mechanical analogues of thermodynamic quanti- 
ties; and a need for amplifying or modifying the methods of thermo- 
dynamics would arise whenever the fluctuations become important. 

For this reason we shall now give a brief although partial account
of the fluctuations to be expected in thermodynamic systems. We shall
confine our attention to systems in thermodynamic equilibrium, and
hence base our calculations on the canonical or grand canonical dis- 
tribution as giving the appropriate representation of the condition to 
be investigated. 

By way of summary we shall first present once more the results
already calculated for the fluctuations in the numbers of molecules or
other component elements in different states in the case of Maxwell-Boltzmann,
Einstein-Bose, and Fermi-Dirac systems. These results,
however, are of course only indirectly related to fluctuations in the
values of the macroscopic variables of thermodynamics. We shall next
consider fluctuations in the total energy of a system in thermodynamic
equilibrium at a specified temperature. This is of immediate thermodynamic
interest since it is only when these fluctuations are small that
the usual thermodynamic procedure of simultaneously specifying the
temperature and the energy of a system is possible. We shall then
consider fluctuations in the external forces exerted by a system in
equilibrium. This is also of immediate thermodynamic interest in connexion
with the thermodynamic procedure of assigning precise values
to such quantities as pressure. We shall then turn to a consideration
of the justification for Einstein’s somewhat approximate procedure of 
relating fluctuations in macroscopic variables describing a system to 
corresponding changes in free energy. The Einstein treatment is of 
interest because of its ready applicability to problems of observational 
significance. Finally, we shall consider the fluctuations in composition for
an open thermodynamic system that would correspond to its representa- 
tion by a grand canonical ensemble. This will be of interest in justifying 
the uses that we have made of the grand canonical ensemble to represent 
systems having precisely specified compositions from the thermodynamic 
point of view, and also in providing a method for treating the important 
observational problem of the fluctuations in density of a fluid. 

(a) Fluctuations in the case of Maxwell-Boltzmann, Einstein-Bose, 
and Fermi-Dirac distributions. In the case of Maxwell-Boltzmann, 
Einstein-Bose, and Fermi-Dirac systems we have seen in §§ 113 and 114, 
for a system composed of n similar weakly interacting elements, that 
the mean number of elements $\bar{n}_k$ in a given energy state $k$ at equilibrium
would be given by the respective formulae 


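$$\bar{n}_k = e^{-\alpha-\beta\epsilon_k} \quad \text{(Maxwell-Boltzmann)}, \tag{141.1}$$

$$\bar{n}_k = \frac{1}{e^{\alpha+\beta\epsilon_k}-1} \quad \text{(Einstein-Bose)}, \tag{141.2}$$

$$\bar{n}_k = \frac{1}{e^{\alpha+\beta\epsilon_k}+1} \quad \text{(Fermi-Dirac)}, \tag{141.3}$$

where $\alpha$ and $\beta$ are constants whose significance we already understand,
and $\epsilon_k$ is the energy of the state $k$. We have also found that the mean
square deviations from these numbers would be given by the respective
formulae

$$\frac{\overline{(n_k-\bar{n}_k)^2}}{(\bar{n}_k)^2} = \frac{1}{\bar{n}_k} \quad \text{(Maxwell-Boltzmann)}, \tag{141.4}$$

$$\frac{\overline{(n_k-\bar{n}_k)^2}}{(\bar{n}_k)^2} = \frac{1}{\bar{n}_k} + 1 \quad \text{(Einstein-Bose)}, \tag{141.5}$$

$$\frac{\overline{(n_k-\bar{n}_k)^2}}{(\bar{n}_k)^2} = \frac{1}{\bar{n}_k} - 1 \quad \text{(Fermi-Dirac)}, \tag{141.6}$$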



where we restrict ourselves to cases for which $n \gg \bar{n}_k$, since we have
dropped a term $-1/n$ in (141.4), and have used grand canonical ensembles
in computing (141.5) and (141.6) instead of petit canonical
ensembles corresponding to a definite total number of elements $n$.

We repeat these formulae here for the sake of inclusion along with 
other expressions for various kinds of fluctuations. It may be remarked 
that the methods which we used for their derivation in §§113 and 114 
could be easily extended (by the performance of further differentiations)
to a calculation of fluctuations of order higher than the second;
but this would not add much to our immediate physical understanding.
It will be noted as the most important characteristic of the mean square
fluctuations that they become unimportant in any case for states $k$ that
are highly populated.

The expression (141.4) for the fluctuations in a Maxwell-Boltzmann 
system is of the well-known ‘classical’ form. It may be used for
investigating the fluctuations in the equilibrium distribution of radia- 
tion when this is treated as corresponding to the excitation of modes 
of electromagnetic oscillation in a hollow enclosure. 

The expression (141.5) for the fluctuations in an Einstein-Bose system
was used by Einstein† for obtaining an alternative calculation of the
fluctuations in radiation when treated as consisting of photons. It was
emphasized by Einstein in this connexion that the first term in (141.5)
could be regarded as corresponding to the fluctuations to be expected
in a gas consisting of distinguishable particles and that the second term
could be regarded as corresponding to fluctuations arising from the wave-like
character of light, thus showing that even in its fluctuation properties
light could be regarded as having a dual wave-particle character.

The expression (141.6) for the fluctuations in a Fermi-Dirac system
was first obtained by Pauli.‡ It will be noted in this case, as a consequence
of the Pauli exclusion principle, that the largest possible mean
population of elements in any state $k$ would be unity and that the
fluctuation then goes to zero. Thus in the case of conduction electrons,
which can be treated as giving a highly degenerate Fermi-Dirac gas
even up to ordinary temperatures, we have very small fluctuations in 
the occupation of those low-lying energy states which are nearly com- 
pletely filled. 
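
As a rough numerical illustration of (141.4) to (141.6), the following Python sketch evaluates the three fractional mean square deviations and compares the Einstein-Bose and Fermi-Dirac forms with a direct summation over the occupation numbers of a single state. The function names, the geometric form assumed for the Einstein-Bose occupation probabilities, and the truncation at `terms` are our own assumptions for the purpose of the illustration and are not taken from §§ 113 and 114.

```python
def fractional_fluctuation(mean_n, statistics):
    """Fractional mean square deviation as written in (141.4)-(141.6)."""
    offset = {"MB": 0.0, "EB": 1.0, "FD": -1.0}[statistics]
    return 1.0 / mean_n + offset


def fermi_direct(mean_n):
    """Single state holding 0 or 1 element; the probability of 1 equals mean_n."""
    variance = (0 - mean_n) ** 2 * (1 - mean_n) + (1 - mean_n) ** 2 * mean_n
    return variance / mean_n**2


def bose_direct(mean_n, terms=5000):
    """Single state with assumed probabilities (1 - q) * q**m for m = 0, 1, 2, ..."""
    q = mean_n / (1.0 + mean_n)  # chosen so that the mean occupation is mean_n
    probs = [(1.0 - q) * q**m for m in range(terms)]
    mean = sum(m * p for m, p in enumerate(probs))
    variance = sum((m - mean) ** 2 * p for m, p in enumerate(probs))
    return variance / mean**2


for mean_n in (0.1, 0.5, 0.9):
    print(mean_n,
          fractional_fluctuation(mean_n, "MB"),
          fractional_fluctuation(mean_n, "EB"), bose_direct(mean_n),
          fractional_fluctuation(mean_n, "FD"), fermi_direct(mean_n))
```

The Fermi-Dirac column falls towards zero as the mean population approaches unity, in agreement with the remark made above on nearly filled states.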

† Einstein, Berl. Ber. (1924), p. 261; (1925), p. 3.

‡ Pauli, Zeits. f. Phys. 41, 81 (1927).

(b) Fluctuations in total energy. We may now turn to a consideration
of the fluctuations in the total energy of a ‘closed’ system in thermodynamic
equilibrium at a specified temperature. Such a system would
be represented by a canonical ensemble with members distributed over
different energies, in correspondence with the circumstance that a
system in thermal equilibrium with a large heat bath could assume
different values of energy by interchange with that bath.

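To obtain a suitable expression for the mean square deviation in
energy it is convenient to start with the expression for the mean energy
of the members of the canonical ensemble

$$\bar{E} = \sum_n E_n\, e^{(\psi-E_n)/\theta} \tag{141.7}$$

and consider the effect of a small variation $\delta\theta$, corresponding to a change
to a neighbouring condition of equilibrium of slightly altered temperature.
Carrying out such a variation, without change in the volume or
other external parameters for the system, we obtain for the variation
in mean energy

$$\delta\bar{E} = \sum_n E_n\, e^{(\psi-E_n)/\theta}\left(\frac{\delta\psi}{\theta} - \frac{\psi-E_n}{\theta^2}\,\delta\theta\right). \tag{141.8}$$

Furthermore, since the total probability

$$\sum_n e^{(\psi-E_n)/\theta} = 1 \tag{141.9}$$

for finding the system in one or another state must remain constant,
we can also write

$$0 = \sum_n e^{(\psi-E_n)/\theta}\left(\frac{\delta\psi}{\theta} - \frac{\psi-E_n}{\theta^2}\,\delta\theta\right) \tag{141.10}$$

as a necessary condition that must be satisfied in making the variation
$\delta\theta$. Combining (141.8) with (141.10), we have

$$\delta\bar{E} = \frac{\overline{E^2}}{\theta^2}\,\delta\theta - \frac{(\bar{E})^2}{\theta^2}\,\delta\theta$$

or

$$\frac{\overline{E^2}-(\bar{E})^2}{(\bar{E})^2} = \frac{\theta^2}{(\bar{E})^2}\,\frac{\delta\bar{E}}{\delta\theta}. \tag{141.11}$$

Re-expressing the left-hand side by an obvious transformation, and
re-expressing the right-hand side by translation into thermodynamic
language, we then obtain the desired expression for the mean square
fluctuations in energy,

$$\frac{\overline{(E-\bar{E})^2}}{(\bar{E})^2} = \frac{kT^2 C_v}{(\bar{E})^2}, \tag{141.12}$$

where $C_v$ is the heat capacity of the system, its volume and other
external parameters being held constant.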

We may first consider the application of this result to the typical
high-temperature situation, where the energy $E$ and heat capacity $C_v$ of
the system would be approximately given by equations of the form

$$E \approx nkT \quad\text{and}\quad C_v \approx nk, \tag{141.13}$$

where $n$ is a number of the order of the number of degrees of freedom of
the system. Substituting in (141.12), we then obtain an expression for
the fractional mean square deviation in energy,

$$\frac{\overline{(E-\bar{E})^2}}{(\bar{E})^2} \approx \frac{1}{n}, \tag{141.14}$$

which is seen to go to zero as the number of degrees of freedom of the
system increases. This is the characteristic kind of result that we can 
expect in the usual cases where we should wish to apply thermo- 
dynamics. It shows for a system of many degrees of freedom, which 
is at equilibrium at a given temperature, that the probabilities for the 
different possible values of energy would be highly concentrated in 
the neighbourhood of the mean energy. It thus provides a justification 
for the thermodynamic procedure of simultaneously assigning definite 
values both to the temperature and to the energy of a system under 
such circumstances. It also provides the grounds for the familiar state- 
ment that the principles of thermodynamics may be regarded as 
becoming exact at the limit of an infinite number of degrees of freedom. 
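
The relation (141.11) can be checked numerically for any assumed set of energy levels. The sketch below is only an illustration under our own assumptions: it takes a hypothetical ladder of 200 equally spaced levels, works in units where $k = 1$ so that $\theta = T$, compares the directly computed mean square deviation with $\theta^2\,\delta\bar{E}/\delta\theta$ obtained by a central difference, and then treats a collection of $n$ independent copies of the ladder, for which both the mean energy and the mean square deviation are additive.

```python
import math


def canonical_moments(levels, theta):
    """Mean energy and mean square deviation of energy at modulus theta = kT."""
    weights = [math.exp(-e / theta) for e in levels]
    z = sum(weights)
    mean = sum(e * w for e, w in zip(levels, weights)) / z
    mean_sq = sum(e * e * w for e, w in zip(levels, weights)) / z
    return mean, mean_sq - mean * mean


levels = [float(j) for j in range(200)]  # hypothetical ladder, unit spacing
theta = 2.0
h = 1e-4                                 # step for the central difference

mean, variance = canonical_moments(levels, theta)
mean_hi, _ = canonical_moments(levels, theta + h)
mean_lo, _ = canonical_moments(levels, theta - h)
d_mean_d_theta = (mean_hi - mean_lo) / (2.0 * h)

# Both sides of (141.11), multiplied through by the square of the mean energy.
print(variance, theta**2 * d_mean_d_theta)

# For n independent copies the fractional mean square deviation falls as 1/n.
for n in (1, 100, 10**6):
    print(n, (n * variance) / (n * mean) ** 2)
```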

We may next consider the application of our formula for fluctuations 
in energy to those special situations which arise when a system of 
interest is in a condition to absorb large quantities of energy without 
appreciable change in temperature. Such a state of affairs would occur 
in the case of a system containing two different phases (say, for
example, solid and liquid), where an absorption of energy would be
primarily used for supplying the heat of transition from one phase to
the other rather than in producing a rise in temperature. Under such
circumstances the heat capacity for the system would become
exceedingly large during the transition, and our formula (141.12)
would lead to large fluctuations in energy. This finding, which was first
appreciated by Gibbs,† again demonstrates the successful character of
the statistical-mechanical explanation of thermodynamics since the
large fluctuations in energy now predicted for a canonical ensemble
from the statistical viewpoint would occur under those very circumstances
where the energy of a system would not be very sharply determined
by its temperature from the thermodynamic viewpoint.

† Gibbs, Elementary Principles in Statistical Mechanics, Yale University Press, 1902;
see in particular the second footnote on p. 75.

Finally, we may consider the application of our formula for energy 
fluctuations to quantum mechanical situations, where the heat capacity
of a system is increasing rapidly with temperature owing to the presence 
of partially excited degrees of freedom. As an illustration we shall take 
the case of a crystal at very low temperature where the heat capacity 
is increasing as the cube of the temperature owing to the gradual 
excitation of its modes of oscillation. In accordance with (137.29) and 
(137.30), we can then take the energy and heat capacity of a crystal 
composed of n particles as given by 


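$$E = \frac{3\pi^4 nkT^4}{5\Theta^3}, \qquad C_v = \frac{12\pi^4 nkT^3}{5\Theta^3}, \tag{141.15}$$

where $\Theta$ is a parameter which characterizes the crystal, and we have
chosen the energy zero-point so as to make the energy go to zero with
the temperature. Substituting into our formula (141.12), we then obtain

$$\frac{\overline{(E-\bar{E})^2}}{(\bar{E})^2} = \frac{20}{3\pi^4}\left(\frac{\Theta}{T}\right)^3\frac{1}{n} \tag{141.16}$$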


as an expression for the energy fluctuations at temperature T. Here 
again we find for any given temperature T that the fluctuations would 
become negligible as we make the number of particles and hence the 
number of degrees of freedom greater. Nevertheless, for any given 
number of particles n, we would always have the possibility of con- 
sidering temperatures T low enough so that the energy fluctuations 
would become exceedingly large. Hence at very low temperatures we 
must regard our usual thermodynamic concepts, which in general make 
energy and temperature simultaneously determinable quantities, as 
subject to limitations even for a system of many degrees of freedom
and over a wide range of possible conditions.†
In conclusion, it is of course not to be forgotten that the simultaneous 
assignment of values to the energy and to the temperature of a system 
becomes in any case impossible when the number of degrees of freedom 
of the system is very small. This agrees with the large percentage 
fluctuations in energy which we should expect in the case of a very 
small system kept in contact with a heat bath. 
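
To give an idea of the magnitudes involved in (141.16), the following sketch evaluates the fractional mean square fluctuation for a crystal of $n$ particles and finds the ratio $T/\Theta$ at which it would reach unity. It is an illustration only: the function names are our own, units with $k = 1$ are assumed, and the low-temperature forms (141.15) are taken to hold throughout, which is reasonable only for $T$ well below $\Theta$.

```python
import math


def crystal_fractional_fluctuation(n, t_over_theta):
    """Right-hand side of (141.16): (20 / (3 pi^4)) (Theta / T)^3 / n."""
    return 20.0 / (3.0 * math.pi**4) / (n * t_over_theta**3)


def from_energy_and_heat_capacity(n, t, big_theta):
    """The same quantity computed as k T^2 C_v / E^2 from (141.15), with k = 1."""
    energy = 3.0 * math.pi**4 * n * t**4 / (5.0 * big_theta**3)
    heat_capacity = 12.0 * math.pi**4 * n * t**3 / (5.0 * big_theta**3)
    return t**2 * heat_capacity / energy**2


def crossover_t_over_theta(n):
    """Ratio T / Theta at which the fractional fluctuation equals unity."""
    return (20.0 / (3.0 * math.pi**4 * n)) ** (1.0 / 3.0)


for n in (1e6, 1e12, 1e20):
    print(n, crystal_fractional_fluctuation(n, 0.01), crossover_t_over_theta(n))
```

For any fixed $n$ the crossover ratio is finite, so that sufficiently low temperatures always lead to large fractional fluctuations, although for macroscopic $n$ the ratio is very small.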


† Compare Planck, Theory of Heat, translated by Brose, Macmillan, 1932, § 136,
p. 268.




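(c) Fluctuations in an external force. We may next turn to the
fluctuations that would be expected at equilibrium in an external force
$A$ exerted by a system corresponding to displacements in some generalized
external coordinate $a$. As a definition (121.6) for the external force
when the system is in a given energy state $n$ we have

$$A_n = -\frac{\partial E_n}{\partial a}, \tag{141.17}$$

and hence for the mean force in a canonical ensemble we may write

$$\bar{A} = -\sum_n \frac{\partial E_n}{\partial a}\, e^{(\psi-E_n)/\theta}. \tag{141.18}$$

To obtain the desired expression for the fluctuations in $A$, we may
start with the necessary expression, holding in the case of a canonical
ensemble,

$$\sum_n e^{(\psi-E_n)/\theta} = 1. \tag{141.19}$$

Regarding the distribution as dependent on the ‘temperature’ $\theta$ and on
the external coordinate $a$, we may then carry out a partial differentiation
with respect to the latter quantity. After multiplying by $\theta$, this
gives us

$$\sum_n \left(\frac{\partial\psi}{\partial a} - \frac{\partial E_n}{\partial a}\right) e^{(\psi-E_n)/\theta} = 0. \tag{141.20}$$

Making use of (141.17), this provides in passing the useful relation

$$\frac{\partial\psi}{\partial a} = -\bar{A}. \tag{141.21}$$

Differentiating (141.20) a second time, we then obtain

$$\sum_n \left[\frac{1}{\theta}\left(\frac{\partial\psi}{\partial a} - \frac{\partial E_n}{\partial a}\right)^2 + \frac{\partial^2\psi}{\partial a^2} - \frac{\partial^2 E_n}{\partial a^2}\right] e^{(\psi-E_n)/\theta} = 0,$$

which, with the help of (141.17) and (141.21), can be written in the form

$$\frac{1}{\theta}\,\overline{(A-\bar{A})^2} + \frac{\partial^2\psi}{\partial a^2} - \overline{\frac{\partial^2 E}{\partial a^2}} = 0. \tag{141.22}$$

This then gives us the desired expression for the mean square deviations
in $A$,

$$\overline{(A-\bar{A})^2} = \theta\left(\overline{\frac{\partial^2 E}{\partial a^2}} - \frac{\partial^2\psi}{\partial a^2}\right), \tag{141.23}$$

which can also be rewritten in the form

$$\overline{(A-\bar{A})^2} = kT\left(\frac{\partial\bar{A}}{\partial a} - \overline{\frac{\partial A}{\partial a}}\right). \tag{141.24}$$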

The term $\partial\bar{A}/\partial a$ in this expression has an immediate macroscopic
interpretation as the change in external force with the associated
generalized coordinate. The other term $\overline{\partial A/\partial a}$ has no such immediate
interpretation and must be calculated in any specific case from its
relation to the mean value of $\partial^2 E/\partial a^2$. Hence we cannot make the
unqualified assertion that fluctuations in all kinds of external forces
will be small.
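
The dependence of the result on the mean value of $\partial^2 E/\partial a^2$ may be illustrated numerically. The sketch below is an illustration under our own assumptions, not a calculation from the text: it adopts a hypothetical level scheme $E_j = j^2/a^2$ (of the kind that would arise for a particle confined to a length $a$, in units chosen for convenience), computes the mean square deviation of the force $A_j = -\partial E_j/\partial a$ directly, and compares it with $\theta\,(\overline{\partial^2 E/\partial a^2} - \partial^2\psi/\partial a^2)$, where $\psi = -\theta\log\sum_j e^{-E_j/\theta}$ is differentiated by a central second difference.

```python
import math

N_LEVELS = 400  # truncation of the hypothetical level scheme


def energies(a):
    """Hypothetical levels E_j = j^2 / a^2 for j = 1 .. N_LEVELS."""
    return [j * j / (a * a) for j in range(1, N_LEVELS + 1)]


def psi(a, theta):
    """psi = -theta * log(sum of exp(-E_j / theta)), so that (141.19) holds."""
    return -theta * math.log(sum(math.exp(-e / theta) for e in energies(a)))


def force_fluctuation_direct(a, theta):
    """Mean square deviation of A_j = -dE_j/da = 2 E_j / a, by direct averaging."""
    es = energies(a)
    ws = [math.exp(-e / theta) for e in es]
    z = sum(ws)
    forces = [2.0 * e / a for e in es]
    mean = sum(f * w for f, w in zip(forces, ws)) / z
    return sum((f - mean) ** 2 * w for f, w in zip(forces, ws)) / z


def force_fluctuation_formula(a, theta, h=1e-3):
    """theta * (mean of d2E/da2 - d2psi/da2), with d2E_j/da2 = 6 E_j / a^2."""
    es = energies(a)
    ws = [math.exp(-e / theta) for e in es]
    z = sum(ws)
    mean_d2e = sum(6.0 * e / (a * a) * w for e, w in zip(es, ws)) / z
    d2psi = (psi(a + h, theta) - 2.0 * psi(a, theta) + psi(a - h, theta)) / (h * h)
    return theta * (mean_d2e - d2psi)


print(force_fluctuation_direct(1.0, 5.0), force_fluctuation_formula(1.0, 5.0))
```

The two printed numbers should agree to within the accuracy of the finite difference.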

A specific calculation of the fluctuations in the pressure $p$ of a perfect
gas has been made with the help of the above formula by Fowler.†
Taking conditions where the calculation could be made classically, and
making reasonable assumptions as to the law of force for the molecules
at the walls, he was led to fluctuations of the negligible amount

(2* ~ 4.7 y 10~^® 

(1)2 - 47 X 10 

for a cubic centimetre of gas under standard conditions. This result 
is of the order l/»*, where n is the number of molecules of gas. 

In closing these remarks on the rigorous calculation of fluctuations
in canonical ensembles, it may be mentioned that the methods used
above may be readily extended to the calculation of fluctuations in
energy $\overline{(E-\bar{E})^n}$ of any desired order $n$, and to fluctuations for quantities
involving external forces of the types

$$\overline{(A-\bar{A})(E-\bar{E})} \quad\text{and}\quad \overline{(A_1-\bar{A}_1)(A_2-\bar{A}_2)},$$

where $A_1$ and $A_2$ are different external forces.‡

(d) Einstein’s formula for fluctuations in a macroscopic variable. In
addition to the foregoing rigorous expressions for the treatment of
fluctuations in canonical ensembles, a somewhat approximate but important
practical method of treating the fluctuations in a macroscopic
variable describing the condition of a system has been devised by
Einstein.∥ We shall continue the present section by showing that Einstein’s
formula, connecting the probability of a fluctuation with a
corresponding increase in free energy, can also be made plausible on
the basis of our present correlation of thermodynamic with statistical-mechanical
quantities, as well as on the basis of the Boltzmann correlation
between entropy and probability that was used by Einstein.

† Fowler, Statistical Mechanics, second edition, Cambridge, 1936, chapter xx, p. 756.
This chapter also gives a treatment, p. 775, of the fluctuations in the time integral as
well as in the instantaneous value of pressure, and may be consulted in general for the
specific treatment of many important fluctuation problems.

‡ See Gibbs, Elementary Principles in Statistical Mechanics, Yale University Press,
1902, chapter vii.

∥ Einstein, Ann. der Phys. 33, 1275 (1910).

The Einstein formula is to be taken as applying to the fluctuations 
at equilibrium, around its mean value $x_0$, in some variable $x$ which
describes the condition of the system of interest from a sufficiently
large-scale point of view, so that it would be possible to regard this
quantity as macroscopically controllable and appropriate to consider
the dependence of the free energy of the system on such a control if it
were introduced. According to circumstances, this control might apply
to the actual value of the variable or to some limit placed on its values.
As an example we could consider the fluctuations, around its mean
value $v_0$, in the volume $v$ of some selected portion of a fluid, taken large
enough so that it would be appropriate to consider the different free 
energies of the system if the volume v of the portion were controlled 
to different upper limits by the introduction of a suitable adjustable 
partition. 

To treat the equilibrium fluctuations in such a variable $x$, we may
regard the equilibrium condition of the system as represented by a
canonical ensemble, and hence may take the probability for any state
$n$ of the system as given by an expression of the form

$$P_n = e^{(\psi-E_n)/\theta}, \tag{141.25}$$

where $\psi$ and $\theta$ are parameters and $E_n$ is the energy of the state $n$. This
formula applies to all possible states $n$ of the system and hence to
states corresponding to all possible values of the freely fluctuating
variable that we are considering. Included among these states $n$ will
be those (designated, say, by the symbol $n(x)$) which would be compatible
with any definite value $x$ at which we might introduce a control
for that variable. For example, returning to our previous illustration,
where the fluctuating variable is taken as the volume of a portion of
a fluid, there would be included among all possible states $n$ of the fluid
those states $n(v)$ which would be compatible with the enclosure of that
portion of the fluid inside an inextensible membrane of constant volume
which would control the actual volume of the portion to the upper
limiting value $v$. For the total probability of all states $n(x)$ which
would be compatible with such a controlled value of the variable $x$,
we can evidently write

$$P(x) = \sum_{n=n(x)} e^{(\psi-E_n)/\theta} = C \sum_{n=n(x)} e^{-E_n/\theta}, \tag{141.26}$$

where we sum over all those states $n = n(x)$ which are, in fact, compatible
with that assigned value, and $C$ is a constant independent of
that value.

This expression for the total probability of the states $n(x)$ under
consideration may now easily be re-expressed in thermodynamic language
by writing

$$P(x) = C\exp\Bigl(\log \sum_{n=n(x)} e^{-E_n/\theta}\Bigr) = C\exp\{-A(x)/kT\}, \tag{141.27}$$

where, in accordance with (122.9), the quantity $A(x)$ is seen to be the
free energy that would be ascribed to the system as a whole if the
variable under consideration were definitely controlled at the value $x$
instead of being allowed to fluctuate freely. Furthermore, by considering
the similar expression which would give the probability $P(x_0)$
for all states $n(x_0)$ compatible with the mean value $x_0$ for our variable,
we can rewrite the above in the convenient form

$$P(x) = P(x_0)\, e^{-\frac{A(x)-A(x_0)}{kT}}, \tag{141.28}$$

where $A(x_0)$ is the free energy that would be ascribed to the system
with the variable in question controlled at its mean value $x_0$.

This formula is not sufficient in general to lead to precise information
as to the probabilities for fluctuations of a given extent, since from
a strict point of view $P(x)$ is the total probability for all states $n(x)$
compatible with some physical control which could be introduced on
our variable at the point $x$, and this is not necessarily the same as the
probability for the definite value $x$ of that variable. Thus, for example,
returning to our previous illustration where the fluctuating variable is
the volume $v$ of a portion of the fluid, the symbol $P(v)$ would denote the
probability for all states $n(v)$ of the fluid such that the volume of
the portion in question would not exceed the value $v$. Nevertheless, the
above formula is often sufficient to give us the qualitative information
that the probabilities for fluctuations of $x$ away from its mean value $x_0$
would be mainly determined by the exponential factor in (141.28), since
for any appreciable difference between $x$ and $x_0$ the increase in free
energy $A(x)-A(x_0)$ would be very large compared with $kT$, and the
main effect would in any case be a corresponding exponential decrease
in probability as we go to larger fluctuations from the mean. Thus, for
example, returning once more to our previous illustration, we realize
that the vast majority of states in which the volume of the selected
portion of fluid would not be greater than $v$ would be ones in which the
volume was nearly equal to $v$, and hence that the decrease in probability
as we go to fluctuations of $v$ farther and farther from $v_0$ would be
essentially determined by the exponential form of the expression given.

In view of the foregoing discussion, it will now often be appropriate,
following Einstein, to express the probability $dP$ for finding our
fluctuating variable within a definite range $x$ to $x+dx$ in the form

$$dP = f(x)\, e^{-\frac{A(x)-A(x_0)}{kT}}\, dx, \tag{141.29}$$

where $f(x)$ is whatever function of $x$ may be needed to make the
equation valid, and $A(x)$ and $A(x_0)$ have their previous significance.
Furthermore, it will often be plausible to assume that $f(x)$ is only a
gradually varying function of $x$ and that the main dependence of the
probability on $x$ is given by the exponential factor.† Hence, if we now
regard the development of $f(x)$ around $x_0$,

$$f(x) = f(x_0) + \left(\frac{\partial f}{\partial x}\right)_{x=x_0}(x-x_0) + \dots, \tag{141.30}$$

it will often be allowable to rewrite (141.29) in the approximate form

$$dP = f(x_0)\, e^{-\frac{A(x)-A(x_0)}{kT}}\, dx, \tag{141.31}$$

since this would be a good expression for the probabilities of fluctuations
when $x-x_0$ is small, and the probabilities for fluctuations making
$x-x_0$ large become in any case so small that it is not then important
to have a precisely correct form of expression for them. We are thus
led to Einstein’s approximate expression for the probabilities of fluctuations
in a macroscopic variable $x$, where $f(x_0)$ is a constant and
$A(x)-A(x_0)$ is the increase in free energy for the system as a whole
which would correspond to a change in our conceptual macroscopic
control on that variable from $x_0$ to $x$.

In applying this formula to actual cases it is often convenient to
regard $A(x)$ itself as developed around the value $A(x_0)$ in the form

$$A(x)-A(x_0) = \left(\frac{\partial A}{\partial x}\right)_{x=x_0}(x-x_0) + \frac{1}{2!}\left(\frac{\partial^2 A}{\partial x^2}\right)_{x=x_0}(x-x_0)^2 + \frac{1}{3!}\left(\frac{\partial^3 A}{\partial x^3}\right)_{x=x_0}(x-x_0)^3 + \dots . \tag{141.32}$$

† For example, when $x$ is the volume $v$ of a portion of a fluid, and the external
parameter corresponding to $a$ in terms of which $A$ is evaluated may be regarded as the
upper limit of this volume, $f(v)$ turns out to be proportional to $p/kT$, where $p$ is the
pressure which the portion of fluid would exert at the volume $v$. This is indeed a slowly
varying function of $v$.


In substituting this expression into (141.31), we may always take
$(\partial A/\partial x)_{x=x_0} = 0$ and drop the first term, since the free energy of an
enclosed system at constant temperature is a minimum at equilibrium,
and hence may be taken as a minimum in the present situation at the
mean value $x = x_0$, which from a thermodynamic point of view would
be regarded as the value of $x$ at equilibrium. Furthermore, in substituting
this expression it will then in general only be necessary to
consider the first of the further terms for which $(\partial^n A/\partial x^n)_{x=x_0}$ does
not vanish, since the later terms will be small in the region of small
fluctuations which are the important ones to consider.

It will be interesting to investigate the probabilities for fluctuations 
in the quite common case where the second derivative of the free energy 
does not vanish. We can then write (141.32) in the form 

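$$A(x)-A(x_0) \approx \frac{1}{2}\left(\frac{\partial^2 A}{\partial x^2}\right)_{x=x_0}(x-x_0)^2. \tag{141.33}$$

Substituting in (141.31), we have

$$dP = f(x_0)\, e^{-\frac{1}{2kT}\left(\frac{\partial^2 A}{\partial x^2}\right)_{x=x_0}(x-x_0)^2}\, dx \tag{141.34}$$

for the probability of a given fluctuation in $x$. With the help of (141.34)
we may then compute the mean square fluctuation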


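$$\overline{(x-x_0)^2} = \frac{\displaystyle\int_{-\infty}^{+\infty}(x-x_0)^2\, e^{-\frac{1}{2kT}\left(\frac{\partial^2 A}{\partial x^2}\right)_{x=x_0}(x-x_0)^2}\, d(x-x_0)}{\displaystyle\int_{-\infty}^{+\infty} e^{-\frac{1}{2kT}\left(\frac{\partial^2 A}{\partial x^2}\right)_{x=x_0}(x-x_0)^2}\, d(x-x_0)}, \tag{141.35}$$

where we integrate over all possible values of $x-x_0$, and do not need
to feel concerned by the inaccuracy of our integrands in ranges where
$x-x_0$ is large since the contribution to the total integrals will in any
case be small in such ranges. With the help of known formulae of
integration (Appendix II) we thus obtain

$$\overline{(x-x_0)^2} = \frac{kT}{(\partial^2 A/\partial x^2)_{x=x_0}} \tag{141.36}$$

for the mean square fluctuation in the variable $x$, or, with the help
of (141.33),

$$\overline{A-A_0} = \tfrac{1}{2}kT \tag{141.37}$$

for the mean fluctuation in the corresponding free energy.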
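
The two results just stated can be checked by carrying out the integrations of (141.35) numerically. The following sketch is an illustration only: the function name, the finite range of twelve standard deviations, and the sample values of the curvature $(\partial^2 A/\partial x^2)_{x=x_0}$ and of $kT$ are our own assumptions, and the constant factor $f(x_0)$ is omitted since it cancels between numerator and denominator.

```python
import math


def gaussian_fluctuation_moments(curvature, kT, half_width_sigmas=12.0, steps=20001):
    """Mean of (x - x0)^2 and of A(x) - A(x0) under the weighting of (141.34)."""
    sigma = math.sqrt(kT / curvature)
    lo = -half_width_sigmas * sigma
    dx = 2.0 * half_width_sigmas * sigma / (steps - 1)
    norm = second = free = 0.0
    for i in range(steps):
        u = lo + i * dx                    # u stands for x - x0
        delta_a = 0.5 * curvature * u * u  # quadratic free-energy increase (141.33)
        w = math.exp(-delta_a / kT)        # exponential factor of (141.34)
        norm += w
        second += u * u * w
        free += delta_a * w
    # The common factor dx cancels in both ratios.
    return second / norm, free / norm


mean_square, mean_free_energy = gaussian_fluctuation_moments(curvature=3.0, kT=0.7)
print(mean_square, 0.7 / 3.0)         # to be compared with kT / curvature
print(mean_free_energy, 0.5 * 0.7)    # to be compared with kT / 2
```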

We have presented the foregoing account of the Einstein method, in
spite of its not very rigorous character, because of its generality and
ready applicability in giving an idea of the magnitude of fluctuations
that can be expected in practical cases. For example, the above
formula at once indicates that we can expect the mean kinetic energy
per degree of freedom of any small part of an apparatus to fluctuate
by a quantity of the order of $\tfrac{1}{2}kT$ at thermodynamic equilibrium. The
method was actually devised by Einstein in order to discuss the fluctuations
in the density of a fluid in the neighbourhood of its critical point
which give rise to the phenomenon of critical opalescence. We shall 
prefer to treat this problem, however, with the help of a discussion of 
the fluctuations that would be predicted for grand canonical ensembles 
to which we now turn. 

(e) Fluctuations in composition. In the foregoing parts of this section 
we have been concerned with fluctuations at thermodynamic equili- 
brium in quantities pertaining to a ‘closed’ thermodynamic system. 
These we have calculated on the assumption that the system of interest 
could be appropriately represented by a canonical ensemble. We must
now consider fluctuations at thermodynamic equilibrium in quantities
(in particular in the composition) characterizing an ‘open’ thermodynamic
system which can interchange component substances with
some other system or ‘reservoir’ with which it is connected in macroscopic
equilibrium. These we shall calculate on the assumption that
the system of interest, when connected with a sufficiently large reservoir
of material, can be appropriately represented by a grand canonical
ensemble.

To make the desired calculation of the fluctuations to be expected
in the amount of any component $i$ contained in a system represented
by a grand canonical ensemble, it is convenient to start with the
expression (140.24) for the mean number of molecules of that kind in
the members of the ensemble

$$\bar{n}_i = \sum n_i\, e^{(\Omega+\mu_1 n_1+\dots+\mu_h n_h-E_n)/\theta} \tag{141.38}$$

and consider the effect of changing to a neighbouring grand canonical
distribution by making a small variation $\delta\mu_i$ in the ‘potential’ for the
component $i$ of interest, without changing the remaining ‘potentials’
$\mu_1 \dots \mu_h$, the ‘temperature’ $\theta$, or the ‘external coordinates’ such
as volume which determine the eigenvalues of energy $E_n$. Such a
change would correspond to a change in the system of interest from
one condition of equilibrium to a neighbouring one of altered composition
but at the same temperature and with the same values for external
coordinates such as volume.


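Applying the suggested variation to (141.38), we obtain

$$\delta\bar{n}_i = \sum n_i\, e^{(\Omega+\mu_1 n_1+\dots+\mu_h n_h-E_n)/\theta}\left(\frac{\delta\Omega}{\theta} + \frac{n_i\,\delta\mu_i}{\theta}\right) = \frac{\bar{n}_i}{\theta}\,\delta\Omega + \frac{\overline{n_i^2}}{\theta}\,\delta\mu_i. \tag{141.39}$$

In carrying out such a variation, however, we must retain the validity
of the fundamental equation

$$\sum e^{(\Omega+\mu_1 n_1+\dots+\mu_h n_h-E_n)/\theta} = 1 \tag{141.40}$$

which makes the total probability of finding a member of the ensemble
in one or another possible condition equal to unity. Hence the above
variation must be carried out subject to the restriction

$$0 = \sum e^{(\Omega+\mu_1 n_1+\dots+\mu_h n_h-E_n)/\theta}\left(\frac{\delta\Omega}{\theta} + \frac{n_i\,\delta\mu_i}{\theta}\right) = \frac{\delta\Omega}{\theta} + \frac{\bar{n}_i}{\theta}\,\delta\mu_i. \tag{141.41}$$

Combining (141.41) with (141.39), we have

$$\delta\bar{n}_i = \frac{\overline{n_i^2}}{\theta}\,\delta\mu_i - \frac{(\bar{n}_i)^2}{\theta}\,\delta\mu_i$$

or

$$\frac{\overline{n_i^2}-(\bar{n}_i)^2}{(\bar{n}_i)^2} = \frac{\theta}{(\bar{n}_i)^2}\,\frac{\delta\bar{n}_i}{\delta\mu_i}. \tag{141.42}$$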


This expression for fluctuations in composition is rigorous and general,
in the form as written. For estimating the actual magnitude given by
(141.42), however, it is often convenient to evaluate $(\delta\bar{n}_i/\delta\mu_i)$ with the
help of its thermodynamic analogue $1/(\partial\mu_i/\partial N_i)$. This procedure is in
general justified whenever the fluctuations in composition are small
enough so that this analogy retains the needed degree of validity,
but can lead to incorrect results in the immediate neighbourhood of a
point where $(\partial\mu_i/\partial N_i)$ vanishes and its lack of continued proportionality
to $(\delta\mu_i/\delta\bar{n}_i)$ becomes all-important, e.g. when exceedingly close to the
critical point of a fluid.

Making the suggested change to thermodynamic language with the
help of (140.32), and re-expressing the left-hand side of (141.42) by
an obvious transformation, we then obtain, as a convenient expression
for the practical computation of fluctuations in the amount of a component
$i$ of an ‘open’ system at equilibrium,

$$\frac{\overline{(n_i-\bar{n}_i)^2}}{(\bar{n}_i)^2} = \frac{RT}{\bar{n}_i N_i\,(\partial\mu_i/\partial N_i)}, \tag{141.43}$$

where $R$ is the gas constant per mol, $N_i$ is the number of mols of the
component $i$ in the system that would be assigned on the basis of
thermodynamic considerations (i.e. $\bar{n}_i$ divided by Avogadro’s number),
and $(\partial\mu_i/\partial N_i)$ is the rate of change in the potential of that substance
with its amount, corresponding to a change in composition of the
system which leaves the potentials of the other components, the temperature,
and external parameters such as volume unaltered.

This result now supplies a necessary link in justifying the use that
we have made of the grand canonical ensemble to represent an ‘open’
thermodynamic system in a condition of equilibrium. As already noted
in § 140 (a), we can expect the quantity $(\partial\mu_i/\partial N_i)$ to be positive,
except under special circumstances involving
for example the existence of the component $i$ in separate phases.
Hence, in typical cases, we see from (141.43) that we can expect the
fractional mean square fluctuations in composition to become negligible
as we take systems containing a sufficiently large mean number of
molecules $\bar{\bar n}_i$ of the component under consideration. For example, in
case the component $i$ is one of a mixture of perfect gases, we have
found (140.20) that we can take

$$\frac{\partial\mu_i}{\partial N_i} = \frac{RT}{N_i} \qquad (141.44)$$

which on substitution in (141.43) gives

$$\frac{\overline{\overline{(n_i-\bar{\bar n}_i)^2}}}{(\bar{\bar n}_i)^2} = \frac{1}{\bar{\bar n}_i} \qquad (141.45)$$
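As a rough numerical illustration of (141.45), the following Python sketch evaluates the fractional mean square fluctuation $1/\bar{\bar n}_i$, and its square root, for samples of a perfect gas of different sizes. The function names and the sample sizes are illustrative assumptions and are not taken from the text.

```python
import math

AVOGADRO = 6.02214076e23  # molecules per mol


def fractional_mean_square_fluctuation(mean_molecules: float) -> float:
    """Right-hand side of (141.45): 1 / (mean number of molecules)."""
    if mean_molecules <= 0:
        raise ValueError("mean number of molecules must be positive")
    return 1.0 / mean_molecules


def rms_fractional_fluctuation(mean_molecules: float) -> float:
    """Square root of the fractional mean square fluctuation."""
    return math.sqrt(fractional_mean_square_fluctuation(mean_molecules))


if __name__ == "__main__":
    # Hypothetical sample sizes, in mols, chosen only to show the trend.
    for mols in (1.0, 1e-12, 1e-20):
        n_mean = mols * AVOGADRO
        print(f"{mols:8.0e} mol: mean n = {n_mean:.3e}, "
              f"rms fractional fluctuation = {rms_fractional_fluctuation(n_mean):.3e}")
```

The relative fluctuation falls off as the inverse square root of the mean number of molecules, so that it is wholly negligible for a macroscopic sample and becomes appreciable only for very small ones.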


In accordance with the foregoing, the members of a grand canonical
ensemble will in general have compositions highly concentrated about
the mean composition in the typical cases where we should wish to use
such an ensemble to represent a thermodynamic system, and appropriate
treatment can be given to the special circumstances where large
fluctuations would be predicted, circumstances which often correspond
to cases where there would be indeterminacy even from a thermodynamic
point of view. This, then, explains the validity of the thermodynamic
procedure of treating ‘open’ systems as having in general a
composition which can be regarded as perfectly definite from a macroscopic
point of view.

It is of interest to appreciate that the high concentration around the 
mean composition in the case of a typical grand canonical distribution 
makes its properties nearly the same as those of a petit canonical 
distribution for a composition taken the same as that mean. This, then, 
makes it unnecessary to give special attention to fluctuations in 
quantities other than composition, in the case of systems represented 
by a grand canonical distribution, since our previous study of such 
fluctuations with the help of petit canonical distributions will give us 
in general a reliable idea of what is to be expected. 

It is also of interest to note that our present quasi-thermodynamic 
method, of showing that we can in general expect the members of a 
grand canonical ensemble to be highly concentrated around a mean 
composition, can now be regarded as providing a new justification for 
the use that we made of the grand canonical distribution in § 114 as 
giving a good description of the equilibrium properties of a simple 
Einstein-Bose or Fermi-Dirac gas composed of a specified number of
particles. 

Moreover, it will be of interest in this connexion to show that our
previous non-thermodynamic discussion of the distribution of the members
of a grand canonical ensemble around their most probable particle
content, as given in § 114 (b), can now itself be translated into more
thermodynamic language. For this purpose we have only to note that
our previous quantities C and a, used in (114.21) in setting up the
grand canonical distribution, and the quantity y(n) defined by (114.24)
can be re-expressed in our present terminology for a system of one
component by

0 «=-i, = (141.46) 

where .4(n) is the firee energy which would be ascribed to a system of 
n molecules at temperature T. Comparing with (114.27), we then 
obtain 

a+lin,-AW>- ji:(^),_,(»-«)*- 

P(n) = e 

(141.47) 

as an expression for the probability of finding a member of the ensemble
containing $n$ molecules, where $\tilde n$ is the most probable number of molecules.
Furthermore, our previous approximate expression (114.30) for
the probability of fluctuations around the most probable number of
molecules, when $(\partial^2 A/\partial n^2)_{n=\tilde n}$ is not equal to zero and higher derivatives
can be neglected, can now be written in the form


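$$\frac{\overline{\overline{(n-\tilde n)^2}}}{(\tilde n)^2} = \frac{kT}{(\tilde n)^2(\partial^2 A/\partial n^2)_{n=\tilde n}} \qquad (141.48)$$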

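Equation (141.47) can be used whenever the analogy $A \leftrightarrow \psi$ is
sufficiently valid, and hence sometimes when the analogy between the
distribution parameter $\mu$ and the Gibbs potential $\mu$
breaks down, e.g. at the critical point of a fluid. When $(\partial^2 A/\partial n^2)$ is not
too small both analogies hold, and equation (141.48) reduces to (141.43),
with $\tilde n \approx \bar{\bar n} = N_0 N$ and $(\partial^2 A/\partial n^2) = N_0^{-2}(\partial\mu/\partial N)$, $N_0$ being Avogadro’s number.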

(f) Fluctuations in density of a fluid. As an important example of
the application of fluctuation theory to empirical problems, we may use
our general formula (141.43) for mean square fluctuations in composition
to investigate the observationally interesting case of fluctuations
in the density of a single pure fluid. For this purpose let us consider
a small portion of the fluid, located in some specified volume $v$, and
containing in the mean $\bar{\bar n}$ molecules, and let us treat this as being an
‘open’ thermodynamic system which is in equilibrium with the rest of
the fluid which serves as a large reservoir for accommodating fluctuations
in the amount of fluid in the specified volume $v$. Representing
this small system by a grand canonical distribution, we can then write,
in accordance with (141.43),


$$\frac{\overline{\overline{(n-\bar{\bar n})^2}}}{(\bar{\bar n})^2} = \frac{RT}{\bar{\bar n}N(\partial\mu/\partial N)} \qquad (141.49)$$

as an expression describing the fluctuations in the number of molecules
in the system, where $R$ is the gas constant per mol, $N$ is the number
of mols in the volume $v$ from the point of view of thermodynamics,
i.e. $\bar{\bar n}$ divided by Avogadro’s number, and

$$\frac{\partial\mu}{\partial N} = \left(\frac{\partial\mu}{\partial N}\right)_{v,T} \qquad (141.50)$$

is the rate at which the Gibbs potential $\mu$ for the fluid in the little
system would change with the number of mols $N$ in it, holding its
volume $v$ and its temperature $T$ constant.

Before making use of the above formula for fluctuations we shall
wish to change it into a form similar to that which has actually been
employed in the interpretation of observations, and which is based on
a knowledge of the equation of state for the fluid rather than of $\mu$, $N$, $v$,
and $T$. For this purpose we may make use of the definition of potential
$\mu$ in terms of free energy which was given by (140.17) in the preceding
section. We can then write

$$\mu = -\int_{\mathrm{v}_0}^{\mathrm{v}} p\,d\mathrm{v} + p\mathrm{v} \qquad (141.51)$$

as an expression for the potential $\mu$ of the fluid in our little system,
where the first term is the isothermal reversible work per mol necessary
to change a portion of the fluid from its standard state of zero free
energy where it has the volume $\mathrm{v}_0$ per mol to the volume $\mathrm{v}$ per mol
which it has in our system, and the second term is the reversible work
per mol necessary to introduce a small portion of it into our system
against the pressure $p$ which it then has. Substituting

$$\mathrm{v} = \frac{v}{N} \qquad (141.52)$$

where $v$ is the volume and $N$ the number of mols of fluid for our system,
the above can be rewritten in the form

$$\mu = -\int_{\mathrm{v}_0}^{v/N} p\,d\mathrm{v} + p\,\frac{v}{N} \qquad (141.53)$$


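which by differentiation then gives

$$\left(\frac{\partial\mu}{\partial N}\right)_{v,T} = \frac{v}{N}\left(\frac{\partial p}{\partial N}\right)_{v,T} \qquad (141.54)$$

Furthermore, we can evidently write

$$N\left(\frac{\partial p}{\partial N}\right)_{v,T} = -v\left(\frac{\partial p}{\partial v}\right)_{N,T} \qquad (141.55)$$

since, for any substance describable by an equation of state, the increase
in pressure at constant temperature and volume for a given
fractional increase in contents must be equal to the increase at constant
temperature and contents for the same fractional decrease in volume.
Combining with the preceding equation, we then have

$$\left(\frac{\partial\mu}{\partial N}\right)_{v,T} = -\frac{v^2}{N^2}\left(\frac{\partial p}{\partial v}\right)_{N,T} \qquad (141.56)$$

It is also convenient to note that we can evidently write

$$\frac{\overline{\overline{(n-\bar{\bar n})^2}}}{(\bar{\bar n})^2} = \frac{\overline{\overline{(\Delta\rho)^2}}}{\rho^2} \qquad (141.57)$$

where $\Delta\rho$ is a fluctuation in the density of the fluid in our little system
around its mean value $\rho$.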

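Substituting (141.56) and (141.57) into (141.49), we can then write
our expression for the fluctuations in density of a fluid in the following
forms

$$\frac{\overline{\overline{(\Delta\rho)^2}}}{\rho^2} = -\frac{RTN}{\bar{\bar n}\,v^2(\partial p/\partial v)_{N,T}} = -\frac{kT}{v^2(\partial p/\partial v)_{N,T}} \qquad (141.58)$$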

We at once note that such fractional fluctuations in density would 
increase rapidly as we consider smaller and smaller portions of the fluid. 
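If the fluid in question is a perfect gas with

$$p = \frac{NRT}{v} \qquad (141.59)$$

the above formula leads to the result

$$\frac{\overline{\overline{(\Delta\rho)^2}}}{\rho^2} = \frac{1}{\bar{\bar n}} \qquad (141.60)$$

where $\bar{\bar n}$ is the mean number of molecules of gas in the little volume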
that has been selected. The percentage fluctuations in density are 
hence very small unless the portion of gas considered is very small. 
Nevertheless, such fluctuations play the interesting role of providing 
small scattering centres for blue light which lead to the blue colour of 
the sky. 
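The passage from the general expression for density fluctuations, taken here in the standard form $-kT/[v^2(\partial p/\partial v)_{N,T}]$, to the perfect-gas result $1/\bar{\bar n}$ can be checked numerically. The Python sketch below differentiates the perfect gas law by a central difference and evaluates the fluctuation for small cubes of gas. The function names, the finite-difference step, and the chosen conditions (273.15 K, one atmosphere, and the cube edges) are assumptions made only for illustration.

```python
R_GAS = 8.314462618        # gas constant per mol, J/(mol K)
BOLTZMANN = 1.380649e-23   # Boltzmann's constant, J/K
AVOGADRO = 6.02214076e23   # molecules per mol


def perfect_gas_pressure(volume, mols, temperature):
    """Perfect gas equation of state, p = N R T / v."""
    return mols * R_GAS * temperature / volume


def dp_dv(pressure, volume, mols, temperature, rel_step=1e-6):
    """Central-difference estimate of (dp/dv) at constant N and T."""
    h = volume * rel_step
    upper = pressure(volume + h, mols, temperature)
    lower = pressure(volume - h, mols, temperature)
    return (upper - lower) / (2.0 * h)


def density_fluctuation(pressure, volume, mols, temperature):
    """Fractional mean square density fluctuation, -kT / (v^2 dp/dv)."""
    slope = dp_dv(pressure, volume, mols, temperature)
    return -BOLTZMANN * temperature / (volume ** 2 * slope)


if __name__ == "__main__":
    temperature = 273.15   # K, assumed
    pressure = 101325.0    # Pa, assumed
    for edge in (1e-3, 1e-6, 2e-7):   # cube edge in metres, assumed
        volume = edge ** 3
        mols = pressure * volume / (R_GAS * temperature)
        value = density_fluctuation(perfect_gas_pressure, volume, mols, temperature)
        print(f"edge {edge:.0e} m: mean n = {mols * AVOGADRO:.3e}, "
              f"mean square fluctuation = {value:.3e}")
```

For a cube whose edge is comparable with the wave-length of visible light the root mean square fluctuation comes out as a fraction of a per cent, while for a cube of millimetre size it is quite inappreciable.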

For a fluid in the neighbourhood of its critical point the fluctuations
predicted by (141.58) would become very large, since right at the
critical point the first and second derivatives of pressure with respect
to volume, for a given amount of fluid at the critical temperature, go
to zero,

$$\left(\frac{\partial p}{\partial v}\right)_T = 0, \qquad \left(\frac{\partial^2 p}{\partial v^2}\right)_T = 0. \qquad (141.61)$$

This accounts in a qualitative way for the striking appearance of
opalescence in an illuminated fluid as we approach the critical point,
the scattering centres now becoming very important and large enough
to scatter light of longer wave-length than the blue. This qualitative
explanation of the phenomenon of critical opalescence, together with
a quantitative discussion of its statistical-mechanical treatment, was
first given by Smoluchowski.†

† Smoluchowski, Ann. der Phys. 25, 206 (1908).

To make a quantitative application of (141.58) to fluctuations in
density in the neighbourhood of the critical point, we note, in accordance
with (141.61), that the value of $(\partial p/\partial v)$ in that neighbourhood
would be closely given by

$$\left(\frac{\partial p}{\partial v}\right) = \left(\frac{\partial^2 p}{\partial T\,\partial v}\right)_c (T-T_c) \qquad (141.62)$$

where $(T-T_c)$ is the difference between the actual temperature and
the critical temperature. Substituting in (141.58), we obtain

$$\frac{\overline{\overline{(\Delta\rho)^2}}}{\rho^2} = -\frac{kT}{v^2\left(\partial^2 p/\partial T\,\partial v\right)_c (T-T_c)} \qquad (141.63)$$

for the expected fluctuations in density $\rho$ inside a specified volume $v$
containing a portion of a fluid in the neighbourhood of its critical
point.†
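The inverse proportionality to $(T-T_c)$ can be made concrete with a definite equation of state. The Python sketch below assumes a van der Waals fluid, which the text does not itself specify, held at its critical volume, and compares the fluctuation computed from the full derivative $(\partial p/\partial v)$ with that from the linearized form, using $(\partial^2 p/\partial T\,\partial v)_c = -R/(4Nb^2)$ for this model. The constants `VDW_A` and `VDW_B` are illustrative values of roughly the magnitude tabulated for ethylene and are assumptions, as are all function names.

```python
R_GAS = 8.314462618        # gas constant per mol, J/(mol K)
BOLTZMANN = 1.380649e-23   # Boltzmann's constant, J/K
AVOGADRO = 6.02214076e23   # molecules per mol

# Assumed van der Waals constants (illustrative, roughly those of ethylene).
VDW_A = 0.4612             # Pa m^6 / mol^2
VDW_B = 5.82e-5            # m^3 / mol


def critical_temperature(a=VDW_A, b=VDW_B):
    """Critical temperature of a van der Waals fluid, 8a / (27 R b)."""
    return 8.0 * a / (27.0 * R_GAS * b)


def critical_volume(mols, b=VDW_B):
    """Critical volume of N mols of a van der Waals fluid, 3 N b."""
    return 3.0 * mols * b


def vdw_dp_dv(volume, mols, temperature, a=VDW_A, b=VDW_B):
    """(dp/dv) at constant N, T for p = NRT/(v - Nb) - a N^2 / v^2."""
    return (-mols * R_GAS * temperature / (volume - mols * b) ** 2
            + 2.0 * a * mols ** 2 / volume ** 3)


def fluctuation_exact(mols, temperature, a=VDW_A, b=VDW_B):
    """-kT / (v^2 dp/dv), evaluated at the critical volume."""
    v = critical_volume(mols, b)
    return -BOLTZMANN * temperature / (v ** 2 * vdw_dp_dv(v, mols, temperature, a, b))


def fluctuation_linearized(mols, temperature, a=VDW_A, b=VDW_B):
    """Same quantity with dp/dv replaced by (d2p/dT dv)_c (T - Tc)."""
    v = critical_volume(mols, b)
    mixed_derivative = -R_GAS / (4.0 * mols * b ** 2)
    delta_t = temperature - critical_temperature(a, b)
    return -BOLTZMANN * temperature / (v ** 2 * mixed_derivative * delta_t)


if __name__ == "__main__":
    mols = 1e-18   # assumed amount of fluid in the little volume
    t_c = critical_temperature()
    for delta_t in (2.0, 1.0, 0.5, 0.1):
        t = t_c + delta_t
        print(f"T - Tc = {delta_t:4.1f} K: exact = {fluctuation_exact(mols, t):.4e}, "
              f"linearized = {fluctuation_linearized(mols, t):.4e}")
```

For this particular model $(\partial p/\partial v)$ at the critical volume is strictly linear in the temperature, so the two evaluations agree, and the product of the fluctuation with $(T-T_c)/T$ remains constant at $4/(9\bar{\bar n})$. For other equations of state the linearized form would hold only approximately.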

A quantitative test of this formula was made by Keesom‡ for the
substance ethylene by using the scattering of light by the ethylene as
providing a method of investigating the fluctuations in its density $\rho$.
In the first place, by comparing the intensities of scattered light of
a given wave-length over a range of different temperatures $T$ (11.24–13.63°)
in the neighbourhood of the critical temperature (11.18° for the
sample of not quite pure ethylene employed), it was possible to show
that the value of $\overline{\overline{(\Delta\rho/\rho)^2}}$ was indeed quite closely inversely proportional
to the distance from the critical temperature $(T-T_c)$. In the second
place, by showing that the ratio between the scattering powers for light
of wave-lengths in the neighbourhood of the F and of the D Fraunhofer
lines decreases as the critical temperature is approached, it was possible
to conclude that the fluctuations become important in larger and larger
volumes $v$ as $(T-T_c)$ is decreased. Finally, by making a careful determination
of the absolute intensity of the scattered light for a particular
wave-length (D) and at a particular temperature (11.73°), it was possible
to verify the actual magnitude of the fluctuations $\overline{\overline{(\Delta\rho/\rho)^2}}$ predicted
by the formula (141.63), inserting therein an empirical value for the
derivative $(\partial^2 p/\partial T\,\partial v)_c$.

The empirical check on the theory of fluctuations in the density of
a fluid is thus very satisfactory. Other familiar examples, where the
statistical-mechanical treatment of the fluctuations to be expected in
a system at thermodynamic equilibrium has been quantitatively verified,
are provided by observations on the distribution of Brownian
particles in a gravitational field, by observations on the linear displacements
suffered by such particles as time proceeds, and by observations
on the torsional oscillations of a small mirror suspended in a high
vacuum. The observational existence and empirical measurement of
the fluctuations, predicted by statistical mechanics, in quantities to
which precise values are assigned from the standpoint of thermodynamics,
provide important justification for our belief that statistical
mechanics is indeed a more fundamental and more valid science than
thermodynamics.

† It will be observed that (141.63) would lead to the prediction of infinite fluctuations
at the critical point where $T = T_c$. It must be remembered, however, that the evaluation
of $(\partial\mu/\partial n)$ in terms of the thermodynamic $(\partial\mu/\partial N)$, on which the derivation was
based, can fail when fluctuations in composition become too large. When (141.63)
thus fails, it is possible to use the distribution described by (141.47) for calculating the
fluctuations since the thermodynamic analogy $A \leftrightarrow \psi$ remains valid as we approach
the critical point from higher temperatures. At the critical point, noting that the
second and third derivatives of $A$ with respect to $n$ will vanish, and using an appropriate
equation of state to evaluate the fourth derivative, this then leads to fluctuations
$\overline{\overline{(\Delta\rho/\rho)^2}}$ of the order […]. The failure of the analogy between the distribution
parameter $\mu_i$ and the Gibbs potential $\mu_i$ when fluctuations in composition
become too large is connected with the necessity of correlating […] with […] in (140.30)
and (140.31). The continued validity of $A \leftrightarrow \psi$ is connected with non-appearance of
those quantities in (122.2) and (122.3).
‡ Keesom, Ann. der Phys. 35, 591 (1911).

142. Conclusion 

This now brings our long book on statistical mechanics to its end 
by completing our proposed discussion of the statistical-mechanical 
explanation of the principles of thermodynamics. It is pleasing to have 
come to this end with the above illustration, which shows so clearly the 
deeper and more powerful character of the newer science of statistical 
mechanics. 

Much has been omitted from the book in the way of considering 
important applications of statistical mechanics to problems of physical 
and chemical interest. In particular we may regret the omission of 
further and more specific consideration of the applications of statistical 
mechanics to studies on the actual rates at which physical-chemical 
systems proceed towards their conditions of equilibrium. This is a field 
in which much remains to be done and in which the methods of 
statistical mechanics are specially needed. 

It is hoped, however, that our exposition of theory has been suffi- 
ciently sound and complete to solve our main task of giving a true 
insight into the methods that must be employed when we wish to 
predict the behaviour of a mechanical system on the basis of less 
knowledge as to its actual state than would in principle be allowable 
and possible. Such partial knowledge of state is in reality all that we 
ever do have, and the discipline of statistical mechanics must always 
remain necessary, whatever changes we may have to make in pure 
mechanics itself from classical to quantum or to any future form. 





APPENDIX I 


SYMBOLS FOR QUANTITIES, OPERATORS, AND MATRICES 

Conventions. Scalar quantities are denoted by letters in italic or
Greek type, e.g. $A$, $a$, $\alpha$. Vector quantities, which occur only rarely in
the text, are denoted by letters in Clarendon or bold-face type, e.g. $\mathbf{A}$,
$\mathbf{E}$. Quantum mechanical operators, which occur frequently in the text,
and a few other operators, are also denoted by letters in Clarendon or
bold-face type, e.g. $\mathbf{H}$, $\mathbf{q}$, $\mathbf{p}$. Matrices are either denoted with the help
of double lines, e.g. $\|F\|$, or more frequently by an expression for the
elements of the matrix, e.g. $F_{kl}$.

The complex conjugate of a quantity is denoted by an asterisk, e.g. $A^*$.
The modulus of a quantity is denoted by single lines, e.g. $|A|$. The
mean value, or the quantum mechanical expectation value, of a quantity
pertaining to a single system is denoted by a single bar, e.g. $\bar A$. The
mean value of a quantity for the members of an ensemble of systems is
denoted by a double bar, e.g. $\bar{\bar A}$. The mode or most probable value of
a quantity is denoted by a tilde, e.g. $\tilde A$.

The same letter may be used to denote a quantum mechanical observable
$F$, its eigenvalues $F_k$, its expectation value $\bar F$, the corresponding
operator $\mathbf{F}$, and the matrix $\|F\|$ composed of the elements $F_{kl}$ which
can be obtained with the help of that operator.

Only those symbols are included in the following tables which occur
frequently in the text with the significance given. Some of these same
symbols appear in a limited context with a different significance.


Scalar Quantities. 


$a(k,t)$, $a_k(t)$    generalized probability amplitude for state $k$ at time $t$.
$a_1, a_2, a_3, \ldots$    external coordinates for a thermodynamic system.
$A_1, A_2, A_3, \ldots$    corresponding external forces.
$A$    free energy of a system, $E - TS$.
$b$    critical distance for a molecular collision.
$b(r,t)$, $b_r(t)$    generalized probability amplitude for state $r$ at time $t$.
$c$    velocity of light.
$c_n$    probability coefficient for energy eigenstate $n$.
$c_k(t)$    probability coefficient for unperturbed energy eigenstate $k$ at time $t$.
$C$    constant.
$C_v$, $C_p$    heat capacities at constant volume or pressure.


$e$    base of natural logarithms; charge of fundamental particle.
$E$    energy of a system.
$E_n$    energy eigenvalue for state $n$.
$E_k^0$    unperturbed energy eigenvalue for state $k$.
$f$    number of degrees of freedom of a system; fugacity.
$f(q_1 \ldots p_r)$    density of molecular distribution in $\mu$-space.
$F$    thermodynamic potential of a system, $E + pv - TS$.
$F(q,p)$    function of coordinates and momenta of a system.
$g$    quantum weight for a degenerate molecular state.
$g_0$    quantum weight for a molecule of gas in ground level.
$g_c$    quantum weight for a molecule of crystal in ground level.
$g_k$    quantum weight for a degenerate molecular state $k$.
$g_\kappa$    number of molecular quantum states in a group of neighbouring states $\kappa$.
$G$    number of classical states, defined by equal volumes in the $\gamma$-space, or number of quantum states corresponding to specified condition of a system.
$G_0$    number of quantum states for a crystal at very low temperature.
$G_k$    quantum weight for a degenerate state $k$.
$G_\kappa$    number of quantum states for a system in a group of neighbouring states $\kappa$.
$h$    Planck’s constant.
$H$    Hamiltonian expression for energy.
$H^0$    unperturbed Hamiltonian.
$H$    Boltzmann function of the condition of a system.
$\bar{\bar H}$    Gibbs function of the condition of a representative ensemble.
$i$    $\sqrt{-1}$.
$k$, $l$    indices specifying states.
$I$    moment of inertia of diatomic molecule.
$j$    rotational quantum number for a diatomic molecule.
$k$    Boltzmann’s constant.
$K_p$    equilibrium constant in terms of partial pressures.
$l$    quantum number for total angular momentum.
$L$    Lagrangian function.



$m$    mass of particle or molecule; quantum number for a component of angular momentum.
$m_0$    rest mass.
$M_x, M_y, M_z$    components of angular momentum for a system.
$n$    number of molecules or other elements composing a system.
$N$    number of systems in an ensemble; number of mols in a system.
$N_0$    Avogadro’s number.
$p$    generalized momentum; pressure.
$p_x, p_y, p_z$    components of linear momentum for a molecule.
$P_x, P_y, P_z$    components of linear momentum for a system.
$P$    probability, coarse-grained density in phase space.
$P_k$    coarse-grained probability for state $k$.
$P_\kappa$    probability for a group of neighbouring states $\kappa$.
$q$    generalized coordinate.
$Q$    heat absorbed.
$r$    radius; number of degrees of freedom for a molecule.
$R$    gas constant per mol.
$s_x, s_y, s_z$    variables.
$S_x, S_y, S_z$    components of probability current density.
$S$    entropy; Hamilton’s characteristic function.
$t$    time.
$T$    temperature; kinetic energy; time interval.
$u(q_1 \ldots q_f)$    eigensolution not dependent on time.
$u(k,q)$, $u_k(q)$    eigenfunction corresponding to a state $k$.
$u_s(q_1 \ldots q_n)$, $u_a(q_1 \ldots q_n)$    symmetric and antisymmetric eigenfunctions for a system of $n$ similar molecules.
$u_x, u_y, u_z$    components of velocity.
$v$    volume; velocity; vibrational quantum number for a diatomic molecule.
$v(r,q)$, $v_r(q)$    eigenfunction corresponding to a state $r$.
$V$    potential energy; perturbation energy.
$W$    work done; Hamilton’s principal function.
$W(q,t)$, $W(p,t)$    probability densities in coordinate or in momentum language at time $t$.
$W_n(t)$    probability for state $n$ at time $t$.
$x$    molal fraction.
$x, y, z$    Cartesian coordinates.
$Z$    sum-over-states; number.



$Z_{\kappa\nu}$    rate of transition from condition $\kappa$ to $\nu$.
ZH    rate of classical collisions.
$\alpha$    parameter in molecular distribution laws.
$\beta$    parameter in distribution laws, correlated with $1/kT$.
$\epsilon$    energy of a molecule.
$\epsilon_0$    energy of a molecule in ground-level.
$\epsilon_k$    energy of a molecule in state $k$.
$\epsilon_\kappa$    energy of a molecule in condition $\kappa$.
$\theta$    distribution parameter in canonical ensembles, correlated with $kT$.
$\Theta$    characteristic temperature of a crystal.
$r, \theta, \phi$    polar coordinates.
$R(r)$, $\Theta(\theta)$, $\Phi(\phi)$    eigensolutions dependent on separated variables.
$\kappa, \lambda, \mu, \nu$    indices specifying groups of quantum states.
$\lambda$    wave-length.
SA    solid angle defining a molecular collision.
$\Lambda$    heat of vaporization per mol.
$\mu$    reduced mass.
$\mu_i$    distribution parameter for $i$th component in grand canonical ensemble.
$\mu_i$    Gibbs potential for $i$th component of a thermodynamic system.
$\nu$    frequency.
$\pi$    ratio of circumference to diameter of circle.
$\rho$    density, fine-grained density in phase space.
$\sigma$    symmetry number.
$\sigma_n$    density of energy eigensolutions.
$\sigma_x, \sigma_y, \sigma_z$    wave numbers.
$\tau$    time interval used in observation.
$\phi(p,t)$    probability amplitude in momentum language at time $t$.
$\psi(q,t)$    probability amplitude in coordinate language at time $t$.
$\psi$    distribution parameter in canonical ensemble, correlated with free energy $A$.
$\delta\omega$    infinitesimal range of internal coordinates and external and internal momenta defining a molecular state.
$\Omega$    distribution parameter in grand canonical ensemble.



Operators (quantum mechanical).

$\mathbf{F}$, $\mathbf{G}$    operators corresponding to observables in general.
$\mathbf{H}$    Hamiltonian operator.
$\mathbf{H}^0$    unperturbed Hamiltonian operator.
$\mathbf{M}_x, \mathbf{M}_y, \mathbf{M}_z$    operators corresponding to components of angular momentum.
$\mathbf{M}^2$    operator corresponding to square of total angular momentum.
$\mathbf{P}_x, \mathbf{P}_y, \mathbf{P}_z$    operators corresponding to components of linear momentum.
$\mathbf{p}_k$    operator corresponding to $k$th generalized momentum.
$\mathbf{q}_k$    operator corresponding to $k$th generalized coordinate.
$\mathbf{s}$    spin operator.
$\mathbf{V}$    perturbation operator.

Operators (miscellaneous).

$\mathbf{1}$    identity operator.
$\mathbf{P}$    operator permuting indices.
$\mathbf{S}$    unitary transformation operator to a different mode of quantum mechanical representation.
$\mathbf{U}(t)$    unitary transformation operator from $t = 0$ to $t = t$.
$\sum_i$    summation over indices $i$.
$\prod_i$    multiplication over indices $i$.

Matrix Elements.

$F_{nm}$, $G_{nm}$, $H_{nm}$, etc.    corresponding to foregoing quantum mechanical operators.
$\delta_{kl}$    Kronecker delta, $= 1$ $(k = l)$, $= 0$ $(k \neq l)$.
$1_{kl}$    elements corresponding to identity operator, $= 1$ $(k = l)$, $= 0$ $(k \neq l)$.
$S_{kl}$    elements of transformation matrix to a different mode of quantum mechanical representation.
$U_{kl}(t)$    elements of time transformation matrix from $t = 0$ to $t = t$.


APPENDIX II 

SOME USEFUL FORMULAE 


(a) DEFINITE INTEGRALS.†


00 


J dx = r(»). 

0 

(1) 

r e-“®a:P dx = 

J flP« 

(2) 

0 ^ 

(3) 

0 ^ 

(4) 

0 ^ 

(5) 

J e X ax- Jo»+i' 

0 

(6) 

0 

(7) 

0 

(8) 

00 , 

J dx =-^. 

(0) 


0 


J e-a**a;8*!+i Ox = (10) 

0 

† See, for example, B. O. Peirce, Short Table of Integrals, Ginn and Co., Boston,



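(b) DIRICHLET INTEGRAL.†

General Formula.

$$I = \int\!\cdots\!\int x_1^{l_1-1}x_2^{l_2-1}\cdots x_n^{l_n-1}\,dx_1\,dx_2\cdots dx_n = \frac{a_1^{l_1}a_2^{l_2}\cdots a_n^{l_n}}{p_1p_2\cdots p_n}\,\frac{\Gamma(l_1/p_1)\,\Gamma(l_2/p_2)\cdots\Gamma(l_n/p_n)}{\Gamma(l_1/p_1+l_2/p_2+\cdots+l_n/p_n+1)}\,c^{\,l_1/p_1+l_2/p_2+\cdots+l_n/p_n} \qquad (11)$$

when integrated over all positive values of $x$ subject to the restriction

$$\left(\frac{x_1}{a_1}\right)^{p_1}+\left(\frac{x_2}{a_2}\right)^{p_2}+\cdots+\left(\frac{x_n}{a_n}\right)^{p_n} \le c \qquad (12)$$

with $l_1, l_2, \ldots, l_n$; $a_1, a_2, \ldots, a_n$; $p_1, p_2, \ldots, p_n$, and $c$ greater than zero.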

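Volume of Ellipsoid.

$$V = \iiint dx\,dy\,dz = \tfrac{4}{3}\pi abc\,r^3, \qquad (13)$$

when integrated over all values of $x$, $y$, $z$ inside the ellipsoid

$$\frac{x^2}{a^2}+\frac{y^2}{b^2}+\frac{z^2}{c^2} = r^2. \qquad (14)$$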


(c) FOURIER INTEGRAL.‡

For any continuous function $f(x)$ for which the integral $\int_{-\infty}^{+\infty} |f(x)|\,dx$
exists, we can write

/(«) = J fl'(y)e-^®’' dy, ( 15 ) 


with 


+ 00 


( 16 ) 


Similarly for a continuous function of several variables $f(x_1 \ldots x_n)$ for
which the integral $\int_{-\infty}^{+\infty}\!\cdots\!\int_{-\infty}^{+\infty} |f(x_1 \ldots x_n)|\,dx_1 \ldots dx_n$ exists we can write
+ 00 +00 

/(»! ... ajJ = J ... J g(p^ ... ... dy^, (17) 

— 00 —00 

th ^00 ^eo 

Vn) = J^a J - J /(®1 - ... dx^. (18) 


— 00 —00 


† See, for example, Edwards, Integral Calculus, Macmillan, London, 1922, vol. ii,
chapter xxv.

‡ See, for example, Courant and Hilbert, Methoden der mathematischen Physik,
Springer, Berlin, 1924, chapter ii, § 6.



SUBJECT INDEX 


Adiabatic principle in quantum theory, 
414. 

Adiabatic processes, 541. 

Antisymmetric solutions, 313 ff.

A priori phases in quantum mechanics, 349. 

A priori probabilities: in classical mecha- 
nics, 59 ; in quantum mechanics, 349, 
423. 

Average, meaning of term, 2. 

Averages: in classical ensembles, 47; in
classical Maxwell-Boltzmann distribution,
90; in quantum mechanics, 208;
in quantum mechanical ensembles,
329.

Boltzmann ratio, 519. 

Boltzmann’s constant, 88, 378, 574. 

Canonical coordinates and momenta: in
classical mechanics, 32; in quantum
mechanics, 193.

Canonical ensemble: in classical mechanics,
58; in quantum mechanics, 347;
as representing equilibrium for system
in heat bath or in essential isolation,
501 ff.; as representing neighbouring
conditions of equilibrium, 533; as
representing thermodynamic equilibrium,
530 ff.; justification for, 503.

Canonical equations of motion, 27. 

Canonical transformations, 32. 

Carnot cycle, 566. 

Change with time: in classical quantities,
27; in density in phase space, 48; in
density matrix, 335; in expectation
values, 237; in quantum mechanical
ensembles, 450; in quantum mechanical
systems, 395 ff.; regarded as a
unitary transformation, 405; resulting
from classical collisions, 99 ff.; resulting
from quantum mechanical collisions,
436; resulting from time proportional
transitions, 424.

Classical limit, approach of quantum 
mechanical behaviour, 243. 

Classical mechanics, 16 ff.

Collisions: application of conservation
laws, 120; application of Liouville’s
theorem, 121; as a classical mechanism
of change with time, 99 ff.; closed
cycles of, 114; in classical gases, 110;
in Fermi-Dirac and Einstein-Bose
gases, 436; probability coefficients for,
127.


Commutation rules for coordinates and 
momenta, 205. 

Commutator for operators, 203. 

Complementarity: idea of, 187, 227; re- 
strictions on observations, 416. 

Condition of a system: in classical mechanics,
74 ff.; in quantum mechanics,
364 ff.

Conduction electrons, 392, 519. 

Configuration space, 45. 

Conjugate variables, 27. 

Conservation: of density and extension in 
phase, 50, 51 ; of energy, 28, 240, 528 ; 
of linear and angular momentum, 30, 
241 ; of quantum mechanical proba- 
bility, 218. 

Continuity of path, 65. 

Convergence in the mean, 256. 

Coordinates: classical, 17; cyclic, 37;
external, 525; stationary, 18; quantum
mechanical, 193.

Correspondence principle, 187, 243.

Cycle of corresponding collisions: in
general, 114; for spherical molecules,
117.

De Broglie waves, 228. 

Degrees of freedom, 18.

Density in phase space, 45 ff.; fine-grained
and coarse-grained, 166, 167; invariance
to transformation, 52; time
dependence, 48.

Density matrix, 327 ff.; for pure and mixed
states, 333; relation to fine-grained
and coarse-grained probabilities, 459;
time dependence, 335; transformation
of, 330.

Detailed balance: in classical mechanics, 
165 ; in quantum mechanics, 521. 

Distribution parameters, 58. 

Dynamical reflectability, 104. 

Dynamical reversibility: in classical 
mechanics, 102; in quantum mecha- 
nics, 395. 

Effect: of essential isolation, 540; of
interaction in general, 549; of irreversible
adiabatic processes, 547; of
reversible adiabatic processes, 542; of
thermal processes, 553.

Eigenfunctions: nature of, 224, 247, 248;
completeness of set, 226, 256; corresponding
to characteristic states, 247;
degeneracy of, 225, 248; enumeration
of, 321; expansions in terms of, 254;
for angular momentum, 294 ff.; for
coordinates and momenta, 253; for
particle in a central field, 299; for particle
in a Hooke’s law field, 291; for
particle in free space, 286; for particles
with spin, 306 ff.; for systems of
like particles, 312 ff.; normalization of,
226, 250; orthogonality of, 225, 249;
symmetric and antisymmetric, 313 ff.

Eigenvalues: nature of, 224, 246, 248; for
angular momentum, 294 ff.; for coordinates
and momenta, 253; for
energy, 224; for particle in a central
field, 299; for particle in a container,
287; for particle in a Hooke’s law
field, 291; for unperturbed energy,
274, 400; spectrum of, 224, 248.

Einstein-Bose systems, 381 ff., 609 ff.

Energy: conservation of, 28, 240, 528;
eigenvalues of, see above; equipartition
of, 93; integral of, 28; kinetic and
potential, 21; levels, 181, 224; mean
value for oscillators, 378; relation
to sum-over-states, 567; statistical
mechanical analogue, 526.

Ensemble of systems: in classical mechanics,
43 ff.; in quantum mechanics,
325 ff.; canonical, 58, 347; microcanonical,
57, 345; pure and mixed
states, 333; representing a system
of interest, 62; representing a thermodynamic
system, 524; surface, 57; uniform,
56, 342.

Entropy: general discussion, 561; relation
to sum-over-states, 567; statistical
mechanical analogue, 535 ff.

Entropy of mixing, 598. 

Enumeration of eigensolutions, 321. 

Equations of motion: canonical, 27; 
Hamiltonian, 27; Lagrangian, 23; 
Newtonian, 25; quantum mechanical 
analogue, 239. 

Equilibrium: between connected systems,
613 ff.; in Maxwell-Boltzmann systems,
71 ff.; in Maxwell-Boltzmann,
Einstein-Bose, and Fermi-Dirac systems,
362 ff.; in physical-chemical
systems, 519; in thermodynamic
systems, 530; statistical mechanical,
55, 339.

Equipartition: general principle of, 96 ff.;
of energy, 93.

Ergodic hypothesis, 65, 358.

Essential isolation, 498, 501, 540.

Essentially adiabatic processes, 498, 541 ff.


Expansions: in terms of eigenfunctions,
254; in terms of energy eigenfunctions,
259; in terms of unperturbed energy
eigenfunctions, 274, 400; of probability
amplitudes, 257 ff.

Expectation values: general treatment,
195 ff.; change with time, 237.

Extension in phase: conservation of, 51;
dimensions of, 54; invariance to transformation,
52.

Fermi-Dirac systems, 388 ff., 509 ff.

Field, idea of, 19. 

Fine- and coarse-grained densities in 
classical phase space, 166, 167. 

Fine- and coarse-grained probabilities in 
quantum mechanics, 459, 460. 

First law of thermodynamics, 529. 

Fluctuations: Einstein’s formula, 636; in
composition, 641; in density of a
fluid, 645; in Einstein-Bose distributions,
512, 630; in energy, 631; in
external forces, 635; in Fermi-Dirac
distributions, 513, 630; in Maxwell-
Boltzmann distributions, 509, 630.

Free energy: relation to other thermodynamic
quantities, 565; statistical
mechanical analogue, 535 ff.

Fugacity, 602.

Gas thermometer, idea of, 86, 376. 

Generating functions for canonical trans- 
formations, 33, 36. 

Gibbs paradox, 626. 

Gibbs potentials, 617. 

Grand canonical ensemble, 511, 619. 

γ-space, 45.

Hamiltonian function, 26; interpretation 
as energy, 28. 

Hamiltonian operator, 211. 

Hamilton-Jacobi partial differential equation,
38.

Hamilton’s characteristic function, 38. 

Hamilton’s equations of motion, 27 ; quan- 
tum mechanical analogue, 239. 

Hamilton’s principal function, 40. 

Hamilton’s principle, 19 ff.; modified, 35.

Heat, statistical mechanical analogue, 526. 

Heat capacity : for simple classical systems, 
93; of conduction electrons, 394; of 
crystals, 380, 590; of monatomic 
gases, 573 ; of more complicated gases, 
676; relation to sum-over-states, 567. 

$H$: definition, 134, 453; continued decrease,
148; rate of change with time, 136,
465; relation to $\bar{\bar H}$, 174, 462.

$\bar{\bar H}$: definition, 166, 459; rate of change with
time, by classical processes, 170; by
quantum mechanical transitions, 463;
by quantum mechanical processes in
general, 466.

H-theorem: in classical mechanics, 134;
in quantum mechanics, 453 ff.; application,
to equilibrium, 169, 480; to
interacting systems, 477; to isolated
systems, 482, 486; to systems in contact
with surroundings, 488, 601;
generalization by Gibbs, 165; relation
to continued fluctuations, 166; to
dynamical reversibility, 162; statistical
character, 146.

Ideal solution, 602. 

Indeterminacy, 187. 

Integrating factor for heat, 663. 

Invariance of density and of extension in 
phase, 62. 

Irreversible adiabatic processes, 647. 

Klein relation, 468. 

Lagrange's equations of motion, 23. 

Lagrangian fimction, 19. 

Lemma: on plogp— plogP— p+P, 169; on 
^ 660. 

Liouville's theorem : in classical mechanics, 
48 ; in quantum mechanics, 336. 

Magnetic susceptibility, 394, 519. 

Matrices corresponding to observables, 265. 

Matrix mechcmics, 266. 

Maxwell-Boltzmann distribution: in clas- 
sical mechanics, 71 fl.; in quantum 
mechanics, 362 fl., 606 ; for molecules 
of more than one kind, 82; mean 
values from, 90 ; various forms of ex- 
pression for, 88. 

Maxwell-Boltzmann, Einstein-Bose, and 
Fenni-Dirac distributions, 362 fl. 

Maxwell’s distribution law for velocities, 
89. 

Mechanics: ekwssical, 16 ff.; quantum, 
180 ff. ; correspondence between two, 
237 fl. 

Method of variation of constants, 273, 400. 

Method of Wentzel, Kramers, and Bril- 
louin, 282. 

Microcanonical ensemble: in classical 
mechanics, 68; in quantum mecha- 
nics, 346 ; as representing eqiiilibrium, 
71 fiE., 362 ff., 486. 

Microscopic reversibility, 163. 

Molecular collisions, 110. 

Molecular constellations, 108. 


Molecular states, 105. 

Momenta, generalized, 26. 

Momentum, conservation of linear cmd 
angular, 30, 241. 

Momentum space, 46. 

jLt-space, 46. 

Nearly steady states, 273. 

Newton’s equations of motion, 25. 

Notation, 13, 326, 650. 

Operators: general discussion, 199; for 
angular momentum, 293 ; for change 
with time, 405; for coordinates and 
momenta, 204; for spin, 306; Hamil- 
tonian, 211; Hermitian, 202; linear, 
201 ; perturltetion, 274, 400; quantum 
mec h anical, 206; transformed, 263; 
unperturbed Hanultoman, 274, 400. 

Particles : in a central fleld, 292 ; in a con- 
tainer, 287; in a Hooke’s law field, 
291 ; in free space, 285; like, 312 ; two 
interacting, 299; with spio, 306. 

Pauli exclusion principle, 321. 

Phase point, 44. 

Phase space, 43 fl. 

Phases, specific and generic, 627. 

Poisson brackets, 27, 238, 267, 339. 

Postulatory treatment, 10. 

Pressure, ration to sum-over-states, 567. 

Probabilities for molecular collisions: In 
classical mechanics, 127; in quantum 
mechanics, 444. 

Probability amplitudes: for coordinates 
and momenta, 190; interrelation, 
193; transformed, 261. 

Probability current, 219. 

Probability density, 189, 218. 

Probability of different conditions, 78, 370- 

Probability of phase, 47. 

Probability waves, 221. 

Qiiantmn mechanics, 180 ff.; generalized 
language, 271 ; relation to old quan- 
tum theory, 284; sununaiy of poetu- 
latory basis, 217. 

Quasi-eigodio hypothesis, 70. 

Badiation, 380, 382. 

Bandom phases, 344, 349, 423. 

Bepresentative ensembles, 62, 524. 

Bepresentative point, 43. 

Beversible adiabatio processes, 542. 

Beversible thermal transfer, 555. 

Beversibflity, dynamical: in classical me- 
chanics, 102; in quantum mechanics, 
395. 




Eigenfunctions (cont.)

of, 321; expansions in terms of, 254;
for angular momentum, 294 ff.; for
coordinates and momenta, 253; for
particle in a central field, 299; for
particle in a Hooke’s law field, 291; for
particle in free space, 286; for
particles with spin, 306 ff.; for systems of
like particles, 312 ff.; normalization of,
226, 250; orthogonality of, 225, 249;
symmetric and antisymmetric, 313 ff.

Eigenvalues: nature of, 224, 246, 248; for 
angular momentum, 294 ff.; for co- 
ordinates and momenta, 253; for 
energy, 224; for particle in a central 
field, 299 ; for particle in a container, 
287; for particle in a Hooke’s law 
field, 291; for unperturbed energy, 
274, 400; spectrum of, 224, 248. 

Einstein-Bose systems, 381 ff., 509 ff.

Energy: conservation of, 28, 240, 628; 
eigenvalues of, see above; equiparti- 
tion of, 93 ; integral of, 28 ; kinetic and 
potential, 21; levels, 181, 224; mean 
value for oscillators, 378; relation 
to sum-over-states, 567; statistical 
mechanical analogue, 526. 

Ensemble of systems: in classical mechanics,
43 ff.; in quantum mechanics,
325 ff.; canonical, 58, 347; microcanonical,
57, 346; pure and mixed
states, 333; representing a system
of interest, 62; representing a thermodynamic
system, 524; surface, 57; uniform,
56, 342.

Entropy: general discussion, 561 ; relation 
to sum-over-states, 567; statistical 
mechanical analogue, 535 ff.

Entropy of mixing, 598. 

Enumeration of eigensolutions, 321. 

Equations of motion: canonical, 27; 
Hamiltonian, 27; Lagrangian, 23; 
Newtonian, 25; quantum mechanical 
analogue, 239.

Equilibrium: between connected systems,
613 ff.; in Maxwell-Boltzmann systems,
71 ff.; in Maxwell-Boltzmann,
Einstein-Bose, and Fermi-Dirac systems,
362 ff.; in physical-chemical
systems, 519; in thermodynamic
systems, 530; statistical mechanical,
55, 339.

Equipartition: general principle of, 96 ff.;
of energy, 93.

Ergodic hypothesis, 65, 358. 

Essential isolation, 498, 501, 540. 

Essentially adiabatic processes, 498, 541.


Expansions: in terms of eigenfunctions, 
254 ; in terms of energy eigenfunctions, 
269; in terms of unperturbed energy
eigenfunctions, 274, 400; of probability
amplitudes, 257 ff.

Expectation values: general treatment, 
196 ff. ; change with time, 237. 

Extension in phase: conservation of, 51; 
dimensions of, 54 ; invariance to trans- 
formation, 52. 

Fermi-Dirac systems, 388 ff., 509 ff.

Field, idea of, 19. 

Fine- and coarse-grained densities in 
classical phase space, 166, 167. 

Fine- and coarse-grained probabilities in 
quantum mechanics, 459, 460. 

First law of thermodynamics, 629. 

Fluctuations: Einstein’s formula, 636; in
composition, 641; in density of a
fluid, 645; in Einstein-Bose distributions,
512, 630; in energy, 631; in
external forces, 636; in Fermi-Dirac
distributions, 513, 630; in Maxwell-Boltzmann
distributions, 509, 630.

Free energy: relation to other thermodynamic
quantities, 565; statistical
mechanical analogue, 535 ff.

Fugacity, 602. 

Gas thermometer, idea of, 86, 376. 

Generating functions for canonical trans- 
formations, 33, 36. 

Gibbs paradox, 626. 

Gibbs potentials, 617. 

Grand canonical ensemble, 511, 619. 

γ-space, 45.

Hamiltonian function, 26; interpretation 
as energy, 28. 

Hamiltonian operator, 211. 

Hamilton-Jacobi partial differential equation,
38.

Hamilton’s characteristic function, 38. 

Hamilton’s equations of motion, 27 ; quan- 
tum mechanical analogue, 239. 

Hamilton’s principal function, 40. 

Hamilton’s principle, 19 ff.; modified, 35.

Heat, statistical mechanical analogue, 526. 

Heat capacity: for simple classical systems,
93; of conduction electrons, 394; of
crystals, 380, 590; of monatomic
gases, 573; of more complicated gases,
576; relation to sum-over-states, 567.

H: definition, 134, 453; continued decrease,
148; rate of change with time, 136,
455; relation to H̄, 174, 462.

H̄: definition, 165, 459; rate of change with
time, by classical processes, 170; by
quantum mechanical transitions, 463;
by quantum mechanical processes in
general, 466.

H-theorem: in classical mechanics, 134 ff.;
in quantum mechanics, 453 ff.; application,
to equilibrium, 169, 480; to
interacting systems, 477; to isolated
systems, 482, 486; to systems in contact
with surroundings, 488, 501;
generalization by Gibbs, 165; relation
to continued fluctuations, 165; to
dynamical reversibility, 162; statistical
character, 146.

Ideal solution, 602. 

Indeterminacy, 187. 

Integrating factor for heat, 663. 

Invariance of density and of extension in
phase, 62. 

Irreversible adiabatic processes, 647. 

Klein relation, 468. 

Lagrange’s equations of motion, 23. 

Lagrangian function, 19.

Lemma: on p log p − p log P − p + P, 169; on

S+Sfe, 660 . 

Liouville’s theorem : in classical mechanics, 
48 ; in quantum mechanics, 336. 

Magnetic susceptibility, 394, 519.

Matrices corresponding to observables, 266. 

Matrix mechanics, 266. 

Maxwell-Boltzmann distribution: in clas- 
sical mechanics, 71 ff.; in quantum 
mechanics, 362 ff ., 606 ; for molecules 
of more than one kind, 82; mean 
values from, 90; various forms of ex- 
pression for, 88. 

Maxwell-Boltzmann, Einstein-Bose, and 
Fermi-Dirac distributions, 362 ff.

Maxwell’s distribution law for velocities, 
89. 

Mechanics: classical, 16 ff.; quantum, 
180 ff.; correspondence between two, 
237 ff. 

Method of variation of constants, 273, 400. 

Method of Wentzel, Kramers, and Bril- 
louin, 282. 

Microcanonical ensemble : in classical 
mechanics, 68; in quantum mecha- 
nics, 346 ; as representing equilibrium, 
71 ff., 362 ff., 486. 

Microscopic reversibility, 163. 

Molecular collisions, 110. 

Molecular constellations, 108. 


Molecular states, 106. 

Momenta, generalized, 26. 

Momentum, conservation of linear and
angular, 30, 241. 

Momentum space, 46. 

μ-space, 45.

Nearly steady states, 273. 

Newton’s equations of motion, 25. 

Notation, 13, 326, 650.

Operators: general discussion, 199; for
angular momentum, 293; for change
with time, 405; for coordinates and
momenta, 204; for spin, 306; Hamiltonian,
211; Hermitian, 202; linear,
201; perturbation, 274, 400; quantum
mechanical, 206; transformed, 263;
unperturbed Hamiltonian, 274, 400.

Particles: in a central field, 292; in a con- 
tainer, 287; in a Hooke’s law field, 
291 ; in free space, 286; like, 312 ; two 
interacting, 299 ; with spin, 306. 

Pauli exclusion principle, 321.

Phase point, 44. 

Phase space, 43 ff. 

Phases, specific and generic, 627. 

Poisson brackets, 27, 238, 267, 339. 

Postulatory treatment, 10.

Pressure, relation to sum-over-states, 567.

Probabilities for molecular collisions: in 
classical mechanics, 127 ; in quantum 
mechanics, 444. 

Probability amplitudes: for coordinates
and momenta, 190; interrelation,
193; transformed, 261.

Probability current, 219. 

Probability density, 189, 218. 

Probability of different conditions, 78, 370.

Probability of phase, 47.

Probability waves, 221. 

Quantum mechanics, 180 ff.; generalized 
language, 271 ; relation to old quan- 
tum theory, 284; summary of postu- 
latory basis, 217. 

Quasi-ergodic hypothesis, 70. 

Radiation, 380, 382. 

Random phases, 344, 349, 423.

Representative ensembles, 62, 524.

Representative point, 43. 

Reversible adiabatic processes, 542.

Reversible thermal transfer, 555.

Reversibility, dynamical: in classical mechanics,
102; in quantum mechanics,
395.





Schroedinger equation: including the time,
209; without the time, 223; examples,
212; integration as a Taylor’s series,
403; integration by method of variation
of constants, 276, 400; integration
when an external parameter is varied,
409; solution in regions of constant
potential, 278; solution in regions of
varying potential, 282; transformed,
216, 264.

Second law of thermodynamics, 668. 

Spaces, γ-, μ-, configuration, momentum,
phase, velocity, 46.

Specification of condition of a system : in 
classical mechanics, 74; in quantum 
mechanics, 364, 416. 

Spectrum: of energy levels, 224; of eigen- 
values in general, 248.

Spin, 306 ff. 

States of a molecule in classical mechanics, 
106 ff. 

States of a system: classical, 17, 77 ; quan- 
tum mechanical, 192, 194; accessible, 
322; characteristic, 246; character- 
istic of more than one observable, 261 ; 
steady, 222; superposition and de- 
composition of, 221 ; symmetric and 
antisymmetric, 313.

Statistical equilibrium: in classical mechanics,
55; in quantum mechanics,
339.

Statistical mechanical analogues: of energy,
work, and heat, 526; of entropy,
temperature, and free energy, 535 ff.;
of Gibbs potentials, 623.

Statistical mechanics: classical, 4, 43 ff.;
quantum mechanical, 6, 326 ff.; and
thermodynamics, 9, 524 ff.; deductive
and inductive approach, 3; fundamental
hypothesis in classical
development, 69 ff.; fundamental
hypothesis in quantum development,
349 ff.; nature of, 1; validity of,
63 ff., 366 ff.

Steady state of a system, 222. 

Sum-over-states: definition, 532; dependence
on molecular states, 668; relation
to thermodynamic quantities, 567.

Superposition, principle of, 220. 

Symmetric solutions, 313 ff.

System: definition, 43; conservative, 20;
elements composing, 43, 362; holonomic,
18; mechanical, 21; non-conservative,
20; non-holonomic, 18, 24; non-mechanical,
22; of interest and representative
ensemble, 2, 62, 326, 524;
state of, in classical mechanics, 17, 19;
state of, in quantum mechanics, 192,
194.

Systems in contact with surroundings: 
long time behaviour, 488; resulting 
equilibrium, 501. 

Systems in perfect isolation: long time
behaviour, 482; resulting equilibrium,
486.

Temperature: general discussion, 563; in- 
troduction from properties of a perfect 
gas, 86, 376; statistical mechanical 
analogue, 535 ff.

Thermal equalization, 498. 

Thermal equilibrium, 664. 

Thermal flow, direction of, 651. 

Thermal interaction, 663. 

Thermal transfer, 663. 

Thermodynamic potential, 604 ff.

Thermodynamic properties: of crystals,
683; of gases, 572, 674; of gaseous 
mixtures, 696; of liquid mixtures, 
600; of mixed crystals, 598. 

Thermodynamic quantities: in terms of
free energy, 565; in terms of
sum-over-states, 567.

Thermodynamic system and representative
ensemble, 524.

Thermodynamic treatment: of chemical 
equilibria, 609; of ‘open’ systems, 
613 ff.; of vapour pressures, 606.

Thermodynamic variables, 624.

Third law of thermodynamics, status of, 261.

Time-proportional collision probabilities, 
444. 

Time-proportional transitions in general, 
424. 

Transformation functions, 194, 261, 262. 

Transformation matrices, 269. 

Transformation theory, 261 ff.

Uncertainty principle, 186, 232, 234. 

Uniform ensemble : in classical mechanics, 
66 ; in quantum mechanics, 342. 

Unitary transformations: between differ- 
ent quantum mechanical representa- 
tions, 268 ; for change with time, 405. 

Vapour pressures, 606. 

Variation of constants, 273, 400. 

Velocities, generalized, 17. 

Velocity space, 45. 

Wave mechanics, 221, 279. 

Wave packets, 231. 

Wave-particle duality, 182, 226.

Work, statistical mechanical analogue, 526.

Zustandsumme, 532. 



NAME INDEX

Badger, vii.

Beattie, 591.

Bell, vii.

Bethe, 394.

Birge, 88, 574.

Birkhoff, 70.

Bloch, 394.

Bohr, 181, 184, 187, 307, 488.

Boltzmann, 52, 65, 108, 142, 158.

Born, 36, 181, 266.

Bose, 382.

Bothe, 183.

Brillouin, 243, 394, 519.

Brody, 54.

Brose, 634.

Burbury, 152.

Clayton, 593.

Compton, 183.

Courant, 656.

Davisson, 184.

De Broglie, 183.

Debye, 181, 380, 585.

Dickinson, vii.

Dirac, 239, 261, 273, 307, 308, 327.

Dushman, 92.

Edwards, 656.

Ehrenfest, 45, 70, 142, 152, 157, 166, 179, 240, 414.

Einstein, 181, 182, 380, 464, 631, 636.

Epstein, 179, 601, 627.

Fermi, 454.

Fierz, 472, 474.

Fowler, vii, 12, 43, 66, 69, 81, 166, 322, 391, 394, 481, 580, 583, 596, 608, 612, 636.

Franck, 182.

Geiger, 183.

Germer, 184.

Giauque, 582, 593, 594.

Gibbs, vii, 1, 14, 15, 43, 65, 69, 168, 178, 503, 504, 506, 526, 614, 620, 633, 636.

Gibson, 594.

Goudsmit, 306.

Güttinger, 409.

Hamilton, 42.

Heisenberg, 184, 266, 312.

Heitler, 594.

Hertz, 182.

Hilbert, 656.

Hildebrand, vii, 601.

Houston, vii, 394.

Ishiwara, 182.

Jacobi, 42.

Johnston, 594.

Jordan, 261, 266, 444.

Jüttner, 94.

Keesom, 648.

Kemble, 234.

Kennard, vii.

Klein, 461, 468.

Koopman, 70.

Landau, 191.

Langer, 283.

Larmor, 214.

Lewis, 519, 602.

Liouville, 49.

Lorentz, 120, 603.

Loschmidt, 152.

McCoy, 207.

Massey, 305, 387.

Maxwell, 65.

Mayer, 619.

Mohr, 387.

Mott, 305, 312.

Mulliken, 292.

Nordheim, 394.

Oppenheimer, viii, 312.

Overstreet, 582, 593.

Pauli, vii, 11, 211, 220, 224, 227, 248, 283, 293, 299, 307, 394, 406, 420, 431, 455, 461, 466, 472, 474, 503, 513, 631.

Pauling, vii, 291, 296, 593, 594.

Peierls, 191, 394.

Peirce, 655.

Planck, 181, 532, 634.

Podolsky, 211.

Poincaré, 54, 155.

Robertson, vii, 234, 235, 252.

Ruark, 191.

Schroedinger, 42, 210, 224, 234, 282, 358.

Simon, 183.

Smoluchowski, 647.

Sommerfeld, 182, 388, 394.

Sterne, 608.

Stout, 593.

Sturdivant, vii.

Tolman, 94, 95, 97, 146, 163.

Uehling, 387.

Uhlenbeck, 306, 387.

Von Kármán, 181, 380.

Von Neumann, 70, 257, 327.

Wentzel, 243, 394.

Weyl, 207, 235, 262.

Whittaker, 30, 584.

Wilson, A. H., 394.

Wilson, E. Bright, vii, 291, 296.

Wilson, Edwin B., 1.

Wilson, W., 182.

Yost, vii.

Zermelo, 155.



PRINTED IN
GREAT BRITAIN
AT THE
UNIVERSITY PRESS
OXFORD
BY
JOHN JOHNSON
PRINTER
TO THE
UNIVERSITY






